[HACKERS] Dead code in Create/RenameRole() after RoleSpec changes related to CURRENT/SESSION_USER
Hi, I found some dead code in the CREATE/RENAME ROLE code path. Attached is a patch to remove it. We introduced RoleSpec and now handle the "public" and "none" role names in the grammar itself, but the equivalent checks still exist in CreateRole() and RenameRole(), where they are no longer reachable. Here is the related commit:

commit 31eae6028eca4365e7165f5f33fee1ed0486aee0
Author: Alvaro Herrera alvhe...@alvh.no-ip.org
Date: Mon Mar 9 15:41:54 2015 -0300

    Allow CURRENT/SESSION_USER to be used in certain commands

...

Thanks
--
Jeevan B Chalke Principal Software Engineer, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company

diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index 3b381c5..d232f52 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -311,13 +311,6 @@ CreateRole(CreateRoleStmt *stmt)
 			 errmsg("permission denied to create role")));
 	}
 
-	if (strcmp(stmt->role, "public") == 0 ||
-		strcmp(stmt->role, "none") == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_RESERVED_NAME),
-				 errmsg("role name \"%s\" is reserved",
-						stmt->role)));
-
 	/*
 	 * Check the pg_authid relation to be certain the role doesn't already
 	 * exist.
@@ -1159,13 +1152,6 @@ RenameRole(const char *oldname, const char *newname)
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
 				 errmsg("role \"%s\" already exists", newname)));
 
-	if (strcmp(newname, "public") == 0 ||
-		strcmp(newname, "none") == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_RESERVED_NAME),
-				 errmsg("role name \"%s\" is reserved",
-						newname)));
-
 	/*
 	 * createrole is enough privilege unless you want to mess with a superuser
 	 */

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE POLICY and RETURNING
On 8 June 2015 at 16:53, Stephen Frost sfr...@snowman.net wrote: I definitely don't like the idea of returning a portion of the data in a RETURNING clause. Returning an error if the RETURNING clause ends up not passing the SELECT policy is an interesting idea, but I do have doubts about how often that will be a useful exercise. Another issue to note is that, currently, SELECT policies never cause errors to be returned, and this would change that. True. Before UPSERT was added, it was the case that USING clauses from all kinds of policies didn't cause errors, and CHECK clauses did, but that has now changed for UPDATE, and I don't think it's necessarily a problem to change it for SELECT, if we decide that it makes sense to apply SELECT policies to rows returned by RETURNING clauses. There was discussion about a VISIBILITY policy instead of a SELECT policy at one point. What I think we're seeing here is confusion about the various policies which exist, because the USING policy of an UPDATE is precisely its VISIBILITY policy, in my view, which is why I wasn't bothered by the RETURNING clause being able to be used in that case. I recall working on the documentation to make this clear, but it clearly needs work. Yeah, perhaps there's scope for improving the documentation, regardless of whether or not we change the current behaviour. One place that we could add additional documentation is the pages describing the INSERT, UPDATE and DELETE commands. These currently each have a paragraph in their main description sections describing what privileges they require, so perhaps we should add similar paragraphs describing what RLS policies are checked, if RLS is enabled on the table. The primary use-case for having a different VISIBILITY for UPDATE (or DELETE) than for SELECT is that you wish to allow users to only modify a subset of the rows in the table which they can see. 
I agree that the current arrangement is that this can be used to allow users to UPDATE records which they can't see (except through a RETURNING). Perhaps that's a mistake and we should, instead, force the SELECT policy to be included for the UPDATE and DELETE policies and have the USING clauses from those be AND'd with the SELECT policy. That strikes me as a much simpler approach than applying the SELECT policy against the RETURNING clause and then throwing an error if it fails. I don't think that quite addresses the concern with RETURNING though, because the resulting clauses would only be applied to the old (pre-update) data, whereas a RETURNING clause might still return new data that you couldn't otherwise see. Though this would remove a bit of flexibility in the system since we would no longer allow blind UPDATEs or DELETEs. I'm not sure if that's really an issue though- to compare it to our GRANT-based system, the only blind UPDATE allowed there is one which goes across the entire table- any WHERE clause used results in a check to ensure you have SELECT rights. That's not quite the extent of it. Column-level privileges might allow you to SELECT the PK column of a table, and then you might be able to UPDATE or DELETE specific rows despite not being able to see their entire contents. I can see cases where that might be useful, but I have to admit that I'm struggling to think of any case where the row-level equivalent would make sense (although that, by itself, is not a good argument for not allowing it). Ultimately, we're trying to provide as much flexibility as possible while having the system be easily understandable by the policy authors and allowing them to write simple policies to enforce the necessary constraints.
Perhaps allowing the open-ended USING clauses for UPDATE and DELETE is simply making for a more complicated system, as policy authors then have to make sure they embed whatever SELECT policy they have into their UPDATE and DELETE policies explicitly- and the result if they don't do that correctly is that users may be able to update/delete things they really shouldn't be able to. Now, as we've discussed previously, users should be expected to test their systems and make sure that they are behaving as they expect, but we don't want to put landmines in either, or have ways in which they can be easily tripped up. Well, I'm all for keeping this as simple and easily understandable as possible. Right now the USING clauses of UPDATE policies control which existing rows can be updated and the CHECK clauses control what new data can go in. That's about as simple as it could be IMO, and trying to merge SELECT policies with either of those would only make it more complicated, and hence make it more likely for users to get it wrong. OTOH, when you tag a RETURNING clause onto the UPDATE, it's kind of like tying an UPDATE and a SELECT together, and it does make sense to me that the new rows returned by the SELECT part of the command be constrained by any SELECT policies on the table.
Re: [HACKERS] Cancel race condition
Ah, OK - I wasn't aware that cancellation was actually delivered as a regular POSIX signal... You're right about the lack of guarantees then. In that case, I'm guessing not much could be done to guarantee sane cancellation behavior... I do understand the best effort idea around cancellations. However, it seems different to say we'll try our best and the cancellation may not be delivered (no bad consequences except maybe performance), and to say we'll try our best but the cancellation may be delivered randomly to any query you send from the moment you send the cancellation. The second makes it very difficult to design a 100% sane, deterministic application... Any plans to address this in protocol 4? Would you have any further recommendations or guidelines to make the situation as sane as possible? I guess I could block any new SQL queries while a cancellation on that connection is still outstanding (meaning that the cancellation connection hasn't yet been closed). As you mentioned this wouldn't be a 100% solution since it would only cover signal sending, but better than nothing? On Tue, Jun 9, 2015 at 1:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: Shay Rojansky r...@roji.org writes: My questions/comments: - Does PostgreSQL *guarantee* that once the connection used to send the cancellation request is closed by the server, the cancellation has been delivered (as mentioned by Tom)? In other words, should I be designing a .NET driver around this behavior? The signal's been *sent*. Whether it's been *delivered* is something you'd have to ask your local kernel hacker. The POSIX standard appears to specifically disclaim any such guarantee; in fact, it doesn't even entirely promise that a self-signal is synchronous. There are also issues like what if the target process currently has signals blocked; does delivery mean that the signal handler's been entered, or something weaker?
In any case, Postgres has always considered that query cancel is a best effort thing, so even if I thought this was 100% portably reliable, I would not be in favor of promising anything in the docs. regards, tom lane
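Shay's suggested client-side mitigation (refusing to send new queries while a cancellation is still outstanding) can be sketched as a toy model. This is purely illustrative, not any real driver's API: the `Connection` class, `on_cancel_connection_closed` hook, and timeout behavior are all hypothetical.

```python
import threading

class Connection:
    """Toy model of a client connection that refuses to send new queries
    while a previously issued cancel request is still outstanding
    (i.e. the server has not yet closed the cancellation connection)."""

    def __init__(self):
        self._cancel_done = threading.Event()
        self._cancel_done.set()  # no cancel outstanding initially

    def cancel(self):
        # A real driver would open the separate cancel connection here and
        # send the cancel request key; we only model the bookkeeping.
        self._cancel_done.clear()

    def on_cancel_connection_closed(self):
        # Server closing the cancel connection means the signal has been
        # *sent* (not necessarily delivered) -- safe to issue new queries.
        self._cancel_done.set()

    def execute(self, sql, timeout=None):
        # Block new queries until any outstanding cancel has been sent,
        # so a late-arriving cancel cannot hit an unrelated later query.
        if not self._cancel_done.wait(timeout):
            raise TimeoutError("cancel request still outstanding")
        return f"executed: {sql}"
```

As noted in the thread, this only narrows the race on the sending side; it gives no guarantee about kernel-level signal delivery in the backend.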
[HACKERS] The Future of Aggregation
It appears to me that there's quite a few new features and optimisations on the not too distant horizon which will require adding yet more fields into pg_aggregate. These are things along the lines of: 1. Parallel Aggregation (computes each aggregate state in parallel worker processes and then merges these states in serial mode) 2. Aggregate push-down / Aggregate before join (requires passing partially computed aggregate states between executor nodes) 3. Auto-updating Materialized views (ones which contain aggregate functions) 4. Foreign table aggregation 5. Dependent Aggregates (I talked about this earlier here - http://www.postgresql.org/message-id/CAKJS1f8ebkc=EhEq+ArM8vwYZ5vSapJ1Seub5=fvrrudctf...@mail.gmail.com ) Items 1-4 above I believe require support of Aggregate State Combine Support - https://commitfest.postgresql.org/5/131/ which I believe will need to be modified to implement complex database types to back our internal aggregate state types so that these types can be properly passed between executor nodes, between worker processes and perhaps foreign data wrappers (maybe just postgres_fdw, I've not looked into this yet). Item 5 makes items 1-4 a bit more complex, as with this item there's opportunity for very good performance improvements by allowing aggregates like AVG(x) to also perform all the required work to allow SUM(x) and COUNT(x) to be calculated for free in a query containing all 3 aggregates. I've discussed item 5 off-list with Simon and he mentioned that we might invent a transition state and transition functions which can have parts switched on and off, much like how calcSumX2 controls whether do_numeric_accum() should calculate sum(x*x) or not. The problem with this is that if we ever want to store aggregate states in an auto-updating materialized view, then this generic aggregate state will have to contain at least 3 fields (to store count(x), sum(x) and sum(x*x)), and those 3 fields would have to be stored even if the aggregate was just a simple count(*).
The idea I discussed in the link in item 5 above gets around this problem, but it's perhaps a more surprise-filled implementation, as it will mean that select avg(x),sum(x),count(x) from t is actually faster than select sum(x),count(x) from t, since the agg state for avg() will satisfy sum and count too. The purpose of this email is to open the doors for discussion about this, so that nobody goes off and develops feature X into a corner and disallows feature Y, and so that we end up with the most optimal solution that does not trip us up in the future. I'm interested to hear if Kevin or Amit have had any time to give this any thought before. It would be good to ensure we all have the same vision here. Regards David Rowley -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
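The "combine states" machinery that items 1, 2 and 4 depend on can be sketched outside the backend. Here is a toy Python model (all names are illustrative, not the backend's API) of per-worker partial states for avg() being merged by a leader and finalized once, which is the essence of parallel aggregation:

```python
# avg(x) modeled with an explicit transition state (count, sum), a combine
# function to merge partial states from parallel workers, and a final function.

def trans(state, x):
    count, total = state
    return (count + 1, total + x)

def combine(s1, s2):
    # Merge two partial aggregate states; this is the new capability the
    # "Aggregate State Combine Support" patch adds per aggregate.
    return (s1[0] + s2[0], s1[1] + s2[1])

def final(state):
    count, total = state
    return total / count if count else None

def parallel_avg(chunks):
    # Each "worker" folds its chunk into a partial state...
    partials = []
    for chunk in chunks:
        state = (0, 0)
        for x in chunk:
            state = trans(state, x)
        partials.append(state)
    # ...and the leader combines the partial states and finalizes once.
    merged = (0, 0)
    for p in partials:
        merged = combine(merged, p)
    return final(merged)
```

The same trans/combine/final split also covers aggregate-before-join and FDW push-down: only the partial (count, sum) state has to travel between nodes, never the input rows.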
Re: [CORE] [HACKERS] back-branch multixact fixes 9.5 alpha/beta: schedule
On Tue, Jun 9, 2015 at 3:04 AM, David Gould da...@sonic.net wrote: On Mon, 8 Jun 2015 13:53:42 -0300 Alvaro Herrera alvhe...@2ndquadrant.com wrote: * people with the wrong oldestMulti setting in pg_control (which would be due to a buggy pg_upgrade being used long ago) will be unable to start if they upgrade to 9.3.7 or 9.3.8. A solution for them would be to downgrade to 9.3.6. We had reports of this problem starting just a couple of days after we released 9.4.2, I think. Does this mean that for people with wrong oldestMulti settings in pg_control due to a buggy pg_upgrade being used long ago can fix this by updating to 9.3.9 when it is released? Asking for a friend... If the value is buggy because it is 1 when it should have some larger value, yes, this should fix it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] bugfix: incomplete implementation of errhidecontext
On Mon, Jun 8, 2015 at 8:19 PM, Andres Freund and...@anarazel.de wrote: On 2015-06-08 14:44:53 +, Jeevan Chalke wrote: The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed This is a trivial bug fix in the area of hiding error context. I observed that there are two places from which we are calling this function to hide the context in log messages. Those were broken. Broken in which sense? They did prevent stuff from going to the server log? I'm not convinced that hiding stuff from the client is really necessarily the same as hiding it from the server log. We e.g. always send the verbose log to the client, even if we only send the terse version to the server log. I don't mind adjusting things for errhidecontext(), but it's not just a bug. The function name itself says that we need to hide the context, and I assume that means from all the logs/client etc. I said it is broken as these two calls are calling this function passing TRUE explicitly, but even so I can see the context messages on the client. Anyway, I don't want to argue on whether it is a bug or not. Greetings, Andres Freund -- Jeevan B Chalke Principal Software Engineer, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company
Re: [HACKERS] Information of pg_stat_ssl visible to all users
On Jun 9, 2015 6:00 AM, Michael Paquier michael.paqu...@gmail.com wrote: Hi all, I should have noticed that before, but it happens that pg_stat_ssl leaks information about the SSL status of all the users connected to a server. Let's imagine for example: 1) Session 1 connected through SSL with a superuser:

=# create role toto login;
CREATE ROLE
=# select * from pg_stat_ssl;
  pid  | ssl | version |           cipher            | bits | compression | clientdn
-------+-----+---------+-----------------------------+------+-------------+----------
 33348 | t   | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 |  256 | t           |
(1 row)

2) New session 2 with previously created user:

=> select * from pg_stat_ssl;
  pid  | ssl | version |           cipher            | bits | compression | clientdn
-------+-----+---------+-----------------------------+------+-------------+----------
 33348 | t   | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 |  256 | t           |
 33367 | t   | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 |  256 | t           |
(2 rows)

Attached is a patch to mask those values to users that should not have access to it, similarly to the other fields of pg_stat_activity. I don't have the thread around right now (on phone), but didn't we discuss this back around the original submission and decide that this was wanted behavior? What actual sensitive data is leaked? If knowing the cipher type makes it easier to hack you have a broken cipher, don't you? /Magnus
Re: [HACKERS] Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file
On Tue, Jun 9, 2015 at 10:56 AM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, Jun 9, 2015 at 1:04 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Jun 9, 2015 at 9:09 AM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, May 12, 2015 at 10:42 PM, Andrew Dunstan and...@dunslane.net wrote: Map basebackup tablespaces using a tablespace_map file Windows can't reliably restore symbolic links from a tar format, so instead during backup start we create a tablespace_map file, which is used by the restoring postgres to create the correct links in pg_tblspc. The backup protocol also now has an option to request this file to be included in the backup stream, and this is used by pg_basebackup when operating in tar mode. This is done on all platforms, not just Windows. This means that pg_basebackup will not work in tar mode against 9.4 and older servers, as this protocol option isn't implemented there. While I was performing the recovery-related tests, I found that there was the case where $PGDATA/tablespace_map was not renamed to *.old at the recovery. Is this harmless and intentional? There shouldn't be any problem because we use the tablespace_map file only if the backup_label file is present. Sorry if this has been already discussed so far. The steps to reproduce the problem are:

1. start base backup, i.e., call pg_start_backup().
2. repeat some write transactions and checkpoints until the WAL file containing the checkpoint record that backup_label indicates will be removed.
3. killall -9 postgres
4. start the server and a crash recovery. At this time, the crash recovery fails with the following error messages.
   2015-06-09 12:26:54 JST FATAL: could not locate required checkpoint record
   2015-06-09 12:26:54 JST HINT: If you are not restoring from a backup, try removing the file .../backup_label.
5. according to the above hint message, remove $PGDATA/backup_label and restart a crash recovery.

Then you can see that tablespace_map remains in $PGDATA after the recovery ends.
The basic idea is that the tablespace_map file will be used in case we have to restore from a backup which contains tablespaces. So I think if the user is manually removing backup_label, there is no meaning to tablespace_map, so that should also be removed. One way to address this is to modify the hint message, suggesting the removal of tablespace_map if present. Current: "If you are not restoring from a backup, try removing the file \"%s/backup_label\"." Modify it to: "If you are not restoring from a backup, try removing the file \"%s/backup_label\" and, if present, \"%s/tablespace_map\"." Or what about removing the tablespace_map file at the beginning of recovery whenever backup_label doesn't exist? Yes, that's another way, but is it safe to assume that the user won't need that file? I mean, in the valid scenario (where both backup_label and tablespace_map are present and usable) we also rename them to *.old rather than deleting them. Yet another way could be to return an error if tablespace_map is present whenever backup_label doesn't exist? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [CORE] [HACKERS] back-branch multixact fixes 9.5 alpha/beta: schedule
On Mon, 8 Jun 2015 13:53:42 -0300 Alvaro Herrera alvhe...@2ndquadrant.com wrote: * people with the wrong oldestMulti setting in pg_control (which would be due to a buggy pg_upgrade being used long ago) will be unable to start if they upgrade to 9.3.7 or 9.3.8. A solution for them would be to downgrade to 9.3.6. We had reports of this problem starting just a couple of days after we released 9.4.2, I think. Does this mean that for people with wrong oldestMulti settings in pg_control due to a buggy pg_upgrade being used long ago can fix this by updating to 9.3.9 when it is released? Asking for a friend... -dg -- David Gould 510 282 0869 da...@sonic.net If simplicity worked, the world would be overrun with insects.
Re: [HACKERS] Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file
On 06/08/2015 11:19 PM, Amit Kapila wrote: On Tue, Jun 9, 2015 at 12:27 AM, Andrew Dunstan and...@dunslane.net wrote: On 06/08/2015 11:16 AM, Amit Kapila wrote: I have to retry that operation, but for me unlink hasn't deleted the file on Windows. Maybe I am not doing it properly, but in any case why do we want to throw an error for such a case, why can't we just ignore it and create a symlink with the same name? 1. You realize that in Windows postgres, unlink is actually pgunlink(), right? See port.h. If your experiments weren't using that then they weren't testing the same thing. Yes, I know that and was using the same version, but the small problem in my test was that the file name that is used for unlink was not same as that of the actual file in the directory, sorry for the noise. 2. If the unlink fails and the file is still there (i.e. pretty much everything except the ENOENT case) then creation of the symlink is bound to fail anyway. I realize our existing code just more or less assumes that it's a symlink. I think we've probably been a bit careless there. I agree with you that deleting an unrelated file with the same name as the symlink is not the right thing to do, but I am not sure throwing an error for such a case is better either. What else would you suggest? I think Robert and Alvaro also seem to be inclined towards throwing an error for such a case, so let us do it that way, but one small point: don't you think that similar code in destroy_tablespace_directories() under label remove_symlink: should use similar logic? Yes, probably. cheers andrew
[HACKERS] could not adopt C locale failure at startup on Windows
We've seen several reports of $SUBJECT lately. I have no idea what's going on, but it occurred to me that it could be informative to tweak init_locale() so that it reports which category failed to be set. Any objections to squeezing that into today's releases? regards, tom lane
Re: [HACKERS] could not adopt C locale failure at startup on Windows
On 2015-06-09 11:20:06 -0400, Tom Lane wrote: We've seen several reports of $SUBJECT lately. I have no idea what's going on, but it occurred to me that it could be informative to tweak init_locale() so that it reports which category failed to be set. Any objections to squeezing that into today's releases? No objection, +1. Seems fairly low risk. The error seems odd. The only even remotely related change between 9.4.1 and .2 seems to be ca325941. Could also be a build environment change.
Re: [HACKERS] Aggregate Supporting Functions
Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kgri...@ymail.com writes: David Rowley david.row...@2ndquadrant.com wrote: [ avoid duplicate calculations for related aggregates ] From the information you have proposed storing, with cost factors associated with the functions, it seems technically possible to infer that you could run (for example) the avg() aggregate to accumulate both but only run the final functions of the aggregates referenced by the query. That seems like an optimization to try hard to forget about until you have at least one real-world use case where it would yield a significant benefit. It seems premature to optimize for that before having the rest working. Actually, I would suggest that you forget about all the other aspects and *just* do that, because it could be made to work today on existing aggregate functions, and it would not require hundreds-to-thousands of lines of boilerplate support code in the grammar, catalog support, pg_dump, yadda yadda. That is, look to see which aggregates use the same transition function and run that just once. I was responding to David's suggestion that this particular query be optimized by using the transition function from avg(): SELECT sum(x), count(x) from bigtable; Reviewing what you said caused me to notice something that I had missed before -- that sum() and avg() share a transition function. Still, that function is not used for count(), so I don't see how that fits into what you're saying above. I agree that what you're suggesting does allow access to some very low-hanging fruit that I had not noticed; it makes sense to get that first. The rest of what David is thinking about could be done in a followon version by allowing the same aggregate to be implemented by any of several transition-function/final-function pairs, then teaching the planner to prefer pairs that let the same transition function be used for several aggregates. 
But I'd see that as a later refinement that might well fail the bang-for-buck test, and hence shouldn't be the first step. Well, that part will be a little tricky, because it would be infrastructure which would allow what are likely significant optimizations in several other features. There's a bit of a chicken-and-egg problem, because these other optimizations can't be written without the infrastructure, and the infrastructure will not show its full worth until the other optimizations are able to take advantage of it. But we can cross that bridge when we come to it. This doesn't look to me like the traditional 80/20 rule. I think the easy stuff might be 5% of the benefit for 1% of the work; but still a better bang for the buck than the other work. What you are proposing as an alternative to what David proposed for the later work looks (on the face of it) like a bigger, more complicated mechanism than he proposed, but more flexible if it can be made to work. What I'd hate to see is failure to get a toaster because it's too expensive to get one that also bakes pizza. We're gonna want to make a lot of toast over the next few years. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] The Future of Aggregation
On 2015-06-09 17:19:33 +0200, Tomas Vondra wrote: ... and yet another use case for 'aggregate state combine' that I just remembered about is grouping sets. What GROUPING SET (ROLLUP, ...) do currently is repeatedly sorting the input, once for each grouping. Actually, that's not really what happens. All aggregates that share a sort order are computed in parallel. Only when sets do not share an order are additional sorts required. What could happen in some cases is building the most detailed aggregation first, then repeatedly combine these partial states. I'm not sure that'll routinely be beneficial, because it'd require keeping track of all the individual most detailed results, no? Greetings, Andres Freund
Re: [HACKERS] The Future of Aggregation
Hi, On 06/09/15 12:58, David Rowley wrote: These are things along the lines of: 1. Parallel Aggregation (computes each aggregate state in parallel worker processes and then merges these states in serial mode) 2. Aggregate push-down / Aggregate before join (requires passing partially computed aggregate states between executor nodes) 3. Auto-updating Materialized views (ones which contain aggregate functions) 4. Foreign table aggregation 5. Dependent Aggregates (I talked about this earlier here - http://www.postgresql.org/message-id/CAKJS1f8ebkc=EhEq+ArM8vwYZ5vSapJ1Seub5=fvrrudctf...@mail.gmail.com) Items 1-4 above I believe require support of Aggregate State Combine Support - https://commitfest.postgresql.org/5/131/ which I believe will need to be modified to implement complex database types to back our internal aggregate state types so that these types can be properly passed between executor nodes, between worker processes and perhaps foreign data wrappers (maybe just postgres_fdw, I've not looked into this yet) I think yet another use case that might benefit from this would be 'memory-bounded hash aggregate'. Jeff Davis was working on a different approach that worked quite well for fixed-length states, but for handling custom states in the 'internal' data type, the (de)serialization seems like a must for this use case. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
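The (de)serialization step Tomas mentions for 'internal' states can be sketched with a fixed binary layout. This is only a model: real internal states are C structs and the wire format here (a hypothetical int64 count plus float64 sum packed with Python's struct module) is purely illustrative.

```python
import struct

# Hypothetical serialize/deserialize pair for a (count, sum) aggregate
# state, so a partial state can be spilled to disk by a memory-bounded
# hash aggregate or shipped between executor nodes as opaque bytes.

STATE_FORMAT = "!qd"  # network byte order: int64 count, float64 sum

def serialize_state(state):
    count, total = state
    return struct.pack(STATE_FORMAT, count, total)

def deserialize_state(buf):
    count, total = struct.unpack(STATE_FORMAT, buf)
    return (count, total)
```

The round trip must be lossless: deserializing a serialized state has to yield a state the combine function can merge exactly as if it had never left memory.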
Re: [HACKERS] The Future of Aggregation
Alvaro Herrera alvhe...@2ndquadrant.com wrote: Kevin Grittner wrote: David Rowley david.row...@2ndquadrant.com wrote: 5. Dependent Aggregates Item 5 makes items 1-4 a bit more complex as with this item there's opportunity for very good performance improvements by allowing aggregates like AVG(x) also perform all the required work to allow SUM(x) and COUNT(x) to be calculated for free in a query containing all 3 aggregates. Not only CPU is saved, but the optimizations for materialized views would require the aggregate function's transition state to be saved in each row, and the duplicate state information among these functions would be a waste of space. Uh, this also requires serialization and deserialization of non-finalized transition state, no? For that sort of optimization to incremental maintenance of materialized views (when we get there), yes. That will be one of many issues to sort out. Any reason you're focusing on that now? Do you think we need to settle on a format for that to proceed with the work David is discussing? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] The Future of Aggregation
Kevin Grittner wrote: David Rowley david.row...@2ndquadrant.com wrote: 5. Dependent Aggregates Item 5 makes items 1-4 a bit more complex as with this item there's opportunity for very good performance improvements by allowing aggregates like AVG(x) also perform all the required work to allow SUM(x) and COUNT(x) to be calculated for free in a query containing all 3 aggregates. Not only CPU is saved, but the optimizations for materialized views would require the aggregate function's transition state to be saved in each row, and the duplicate state information among these functions would be a waste of space. Uh, this also requires serialization and deserialization of non-finalized transition state, no? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] The Future of Aggregation
On 06/09/15 16:10, Tomas Vondra wrote: Hi, On 06/09/15 12:58, David Rowley wrote: ... Items 1-4 above I believe require support of Aggregate State Combine Support - https://commitfest.postgresql.org/5/131/ which I believe will need to be modified to implement complex database types to back our internal aggregate state types so that these types can be properly passed between executor nodes, between worker processes and perhaps foreign data wrappers (maybe just postgres_fdw, I've not looked into this yet) I think yet another use case that might benefit from this would be 'memory-bounded hash aggregate'. Jeff Davis was working on a different approach that worked quite well for fixed-length states, but for handling custom states in the 'internal' data type, the (de)serialization seems like a must for this use case. ... and yet another use case for 'aggregate state combine' that I just remembered about is grouping sets. What GROUPING SET (ROLLUP, ...) do currently is repeatedly sorting the input, once for each grouping. What could happen in some cases is building the most detailed aggregation first, then repeatedly combining these partial states. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] The Future of Aggregation
Kevin Grittner wrote: Alvaro Herrera alvhe...@2ndquadrant.com wrote: Uh, this also requires serialization and deserialization of non-finalized transition state, no? For that sort of optimization to incremental maintenance of materialized views (when we get there), yes. That will be one of many issues to sort out. Any reason you're focusing on that now? Do you think we need to settle on a format for that to proceed with the work David is discussing? No, it's just that it wasn't on David's list. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Aggregate Supporting Functions
Kevin Grittner kgri...@ymail.com writes: David Rowley david.row...@2ndquadrant.com wrote: [ avoid duplicate calculations for related aggregates ] From the information you have proposed storing, with cost factors associated with the functions, it seems technically possible to infer that you could run (for example) the avg() aggregate to accumulate both but only run the final functions of the aggregates referenced by the query. That seems like an optimization to try hard to forget about until you have at least one real-world use case where it would yield a significant benefit. It seems premature to optimize for that before having the rest working. Actually, I would suggest that you forget about all the other aspects and *just* do that, because it could be made to work today on existing aggregate functions, and it would not require hundreds-to-thousands of lines of boilerplate support code in the grammar, catalog support, pg_dump, yadda yadda. That is, look to see which aggregates use the same transition function and run that just once. We already have the rule that the final function can't destroy the transition output, so running two different final functions on the same transition result should Just Work. The rest of what David is thinking about could be done in a followon version by allowing the same aggregate to be implemented by any of several transition-function/final-function pairs, then teaching the planner to prefer pairs that let the same transition function be used for several aggregates. But I'd see that as a later refinement that might well fail the bang-for-buck test, and hence shouldn't be the first step. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] The Future of Aggregation
David Rowley david.row...@2ndquadrant.com wrote: It appears to me that there's quite a few new features and optimisations on the not too distant horizon which will require adding yet more fields into pg_aggregate. These are things along the lines of: 3. Auto-updating Materialized views (ones which contain aggregate functions) Yes, that's certainly on the road map. The recent work to add support for inverse transition functions already goes a huge way toward allowing optimization of incremental maintenance of aggregates in materialized views. Items 1-4 above I believe require support of Aggregate State Combine Support - https://commitfest.postgresql.org/5/131/ Yeah, that seems likely to extend optimized cases even further. 5. Dependant Aggregates Item 5 makes items 1-4 a bit more complex as with this item there's opportunity for very good performance improvements by allowing aggregates like AVG(x) also perform all the required work to allow SUM(x) and COUNT(x) to be calculated for free in a query containing all 3 aggregates. Not only CPU is saved, but the optimizations for materialized views would require the aggregate function's transition state to be saved in each row, and the duplicate state information among these functions would be a waste of space. I've discussed item 5 off-list with Simon and he mentioned that we might invent a transition state and transition functions which can have parts switched on and off much like how calcSumX2 controls if do_numeric_accum() should calculate sum(x*x) or not. The problem with this is that if we ever want to store aggregate states in an auto-updating materialized view, then this generic aggregate state will have to contain at least 3 fields (to store count(x), sum(x) and sum(x*x)), and those 3 fields would have to be stored even if the aggregate was just a simple count(*). 
Yeah, I think we want to preserve the ability of count() to have a simple state, and implement dependent aggregates as discussed in the other thread -- where (as I understood it) having sum(x), count(x), and avg(x) in a query would avoid the row-by-row work for sum(x) and count(x), and just invoke a final function to extract those values from the transition state of the avg(x) aggregate. I see incremental maintenance of materialized views taking advantage of the same sort of behavior, only maintaining the state for avg(x) during incremental maintenance and *at the end* pulling the values for sum(x) and count(x) out of that. The idea I discussed in the link in item 5 above gets around this problem, but it's a perhaps more surprise-filled implementation as it will mean select avg(x),sum(x),count(x) from t is actually faster than select sum(x),count(x) from t as the agg state for avg() will satisfy sum and count too. I'm skeptical that it will be noticeably faster. It's easy to see why this optimization will make a query *with all three* faster, but I would expect the process of accumulating the sum and count to be about the same speed whether performed by one transition function or two. Of course I could be persuaded by a benchmark showing otherwise. The purpose of this email is to open the doors for discussion about this so that nobody goes off and develops feature X into a corner and disallows feature Y and so that we end up with the most optimal solution that does not trip us up in the future. That's laudable. So far I don't see anything that will do anything but make the materialized view maintenance easier, at least if dependent aggregates are implemented as described in the other thread. I'm interested to hear if Kevin or Amit have had any time to give this any thought before. It would be good to ensure we all have the same vision here. It sounds great to me. 
I had thought about the need to deal with these issues at some point to allow optimizations to the incremental maintenance of materialized views (there needs to be a fall-back of recalculating from scratch where such optimizations are not possible, but people will not be happy with the performance of that for simple cases where it is intuitively clear that we could do better). You have developed the ideas farther than I have, and (at least on a first review) I like what I'm seeing. There may be some devils in the details down the road, but I would say that what you're doing looks like it will dovetail nicely with what's on my road map. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] could not adopt C locale failure at startup on Windows
Andres Freund and...@anarazel.de writes: On 2015-06-09 11:20:06 -0400, Tom Lane wrote: We've seen several reports of $SUBJECT lately. I have no idea what's going on, but it occurred to me that it could be informative to tweak init_locale() so that it reports which category failed to be set. Any objections to squeezing that into today's releases? No objection, +1. Seems fairly low risk. The error seems odd. The only even remotely related change between 9.4.1 and .2 seems to be ca325941. Could also be a build environment change. Yeah, my first instinct was to blame ca325941 as well, but I don't think any of that code executes during init_locale(). Also, http://www.postgresql.org/message-id/20150326040321.2492.24...@wrigleys.postgresql.org says it's been seen in 9.4.1. Of course, it might be premature to assume that report had an identical cause to the later ones. What I plan to do is this:

    static void
    init_locale(const char *categoryname, int category, const char *locale)
    {
        if (pg_perm_setlocale(category, locale) == NULL &&
            pg_perm_setlocale(category, "C") == NULL)
            elog(FATAL, "could not adopt either \"%s\" locale or C locale for %s",
                 locale, categoryname);
    }

with the obvious changes to the call sites to pass the string names of the categories not just their numeric codes. (We could just log the numbers, but it'd be much harder to interpret.) This might be overkill, but when you don't know what you're looking for, every little bit helps ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] The Future of Aggregation
On 06/09/15 17:27, Andres Freund wrote: On 2015-06-09 17:19:33 +0200, Tomas Vondra wrote: ... and yet another use case for 'aggregate state combine' that I just remembered about is grouping sets. What GROUPING SET (ROLLUP, ...) do currently is repeatedly sorting the input, once for each grouping. Actually, that's not really what happens. All aggregates that share a sort order are computed in parallel. Only when sets do not share an order are additional sorts required. Oh, right, that's what I meant, but failed to explain clearly. What could happen in some cases is building the most detailed aggregation first, then repeatedly combine these partial states. I'm not sure that'll routinely be beneficial, because it'd require keeping track of all the individual most detailed results, no? Yes, it requires tracking all the detailed aggregate states. I'm not claiming this is beneficial in every case - sometimes the current sort approach will be better, sometimes the combine approach will be faster. In a sense it's similar to GroupAggregate vs. HashAggregate. I expect this 'combine' approach will be much faster in cases with many source rows, but number of groups so small the detailed states fit into work_mem. In that case you can do hashagg and then walk through the hash table to build the actual results. This entirely eliminates the expensive sorts, which is killing us in many DWH workloads (because real-world queries usually produce only few rows, even on very large data sets). But ISTM this might help even in cases when the detailed states don't fit into memory, still assuming the number of groups is much smaller than the number of source rows. Just do partial aggregation by aggregating the source rows (using hashagg) until you fill work_mem. Then either dump all the aggregate states to disk or only some of them (least frequently used?) and continue. 
Then you can sort the states, and assuming it's much smaller amount of data, it'll be much faster than sorting all the rows. And you can do the grouping sets using multiple sorts, just like today. Of course, this only works if the partial aggregation actually reduces the amount of data spilled to disk - if the aggregate states grow fast, or if you get the tuples in certain order, it may get ugly. -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Draft release notes for 9.4.4 et al
I've pushed up draft release notes for 9.4.4 at http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=21187cfc7dfd82461db9119377a76366c00d27c3 Please review and comment ASAP, particularly if the instructions regarding avoiding emergency autovacuuming are not accurate. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for 9.4.4 et al
Hi, On 2015-06-09 13:09:27 -0400, Tom Lane wrote: I've pushed up draft release notes for 9.4.4 at http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=21187cfc7dfd82461db9119377a76366c00d27c3 Please review and comment ASAP, particularly if the instructions regarding avoiding emergency autovacuuming are not accurate. Generally looks good. Some of the references to 9.3 in the 9.4 release notes don't seem relevant though - I guess you intend to adjust them when spreading the notes to all versions? Specifically "You must use PostgreSQL 9.3.5 or later to perform this step." seems likely to confuse users, given there's no corresponding 9.4 release. I wonder if it'd be good to weaken "Users can avoid that by doing manual vacuuming *before* upgrading to this release." a bit, further emphasizing that that's optional. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Why no jsonb_exists_path()?
Dmitry, Alexander: I'm noticing a feature gap for JSONB operators; we have no way to do this: jsonb_col ? ARRAY['key1','key2','key3'] ... that is, there is no way for us to check for key existence in an indexable fashion. Given that @> already can check the whole path including the value, is there some challenge to stopping just short of the value I'm not seeing? Or is this just a "didn't get to it yet" issue? I'm trying to estimate the difficulty of creating this operator in an extension (obviously we're not doing it for 9.5), and maybe for 9.6. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for 9.4.4 et al
Andres Freund and...@anarazel.de writes: On 2015-06-09 13:09:27 -0400, Tom Lane wrote: I've pushed up draft release notes for 9.4.4 at http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=21187cfc7dfd82461db9119377a76366c00d27c3 Please review and comment ASAP, particularly if the instructions regarding avoiding emergency autovacuuming are not accurate. Generally looks good. Some of the references to 9.3 in the 9.4 release notes don't seem relevant though - I guess you intend to adjust them when spreading the notes to all versions? Specifically "You must use PostgreSQL 9.3.5 or later to perform this step." seems likely to confuse users, given there's no corresponding 9.4 release. Hm. It's valid for 9.4.x users, but I agree probably unnecessary. Will drop it in the 9.4.4 version. I wonder if it'd be good to weaken "Users can avoid that by doing manual vacuuming *before* upgrading to this release." a bit, further emphasizing that that's optional. OK. Thanks for reviewing! regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for 9.4.4 et al
Andres Freund wrote: Hi, On 2015-06-09 13:09:27 -0400, Tom Lane wrote: I've pushed up draft release notes for 9.4.4 at http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=21187cfc7dfd82461db9119377a76366c00d27c3 Please review and comment ASAP, particularly if the instructions regarding avoiding emergency autovacuuming are not accurate. Generally looks good. +1 Some of the references to 9.3 in the 9.4 release notes don't seem relevant though - I guess you intend to adjust them when spreading the notes to all versions? Specifically "You must use PostgreSQL 9.3.5 or later to perform this step." seems likely to confuse users, given there's no corresponding 9.4 release. Oh, right, 9.4beta2 was released at the same time as 9.3.5, so by the time 9.4 was out, it already contained the fixes. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why no jsonb_exists_path()?
Josh Berkus j...@agliodbs.com writes: I'm noticing a feature gap for JSONB operators; we have no way to do this: jsonb_col ? ARRAY['key1','key2','key3'] ... that is, there is no way for us to check for key existence in an indexable fashion. Given that @> already can check the whole path including the value, is there some challenge to stopping just short of the value I'm not seeing? Or is this just a "didn't get to it yet" issue? Hm, well, the jsonb_path_ops opclass couldn't do it, because what it indexes is hashes that include the value. I suppose jsonb_ops could look for entries that match all of the keys and then see if the ordering is correct. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add pg_audit, an auditing extension
Hi Noah, On 6/8/15 10:13 AM, Noah Misch wrote: My condemnation of the pg_audit commits probably hurt you as the feature's authors. I am sorry for that. Your code was better than most Ready for Committer code, and I hope you submit more patches in the future. I appreciate you saying this and especially for saying it publicly. I've certainly had quite the experience as a first-time contributor working on this patch. Perhaps I bit off more than I should have and I definitely managed to ruffle a few feathers along the way. I learned a lot about how the community works, both the good and the bad. Fear not, though, I'm not so easily discouraged and you'll definitely be hearing more from me. My honest, albeit novice, opinion is that it was a mistake to pull pg_audit from contrib. I know more than anyone that it had flaws, mostly owing to its implementation as an extension, but it also provided capability that simply does not exist right now. Recent conversations about PGXN demonstrate why that is not (currently) a good alternative for distributing extensions. That means pg_audit will have a more limited audience than it could have had. That's a shame, because people are interested in pg_audit, warts and all. The stated purpose of contrib is: include porting tools, analysis utilities, and plug-in features that are not part of the core PostgreSQL system, mainly because they address a limited audience or are too experimental to be part of the main source tree. This does not preclude their usefulness. Perhaps we should consider modifying that language, because from my perspective pg_audit fit the description perfectly. Of course, I understand this is a community effort and I don't expect every contribution to be accepted and committed. Consider me disappointed yet determined. -- - David Steele da...@pgmasters.net
Re: [HACKERS] Information of pg_stat_ssl visible to all users
On Tue, Jun 9, 2015 at 3:27 PM, Magnus Hagander mag...@hagander.net wrote: On Jun 9, 2015 6:00 AM, Michael Paquier michael.paqu...@gmail.com wrote: Hi all, I should have noticed that before, but it happens that pg_stat_ssl leaks information about the SSL status of all the users connected to a server. Let's imagine for example: 1) Session 1 connected through SSL with a superuser:

    =# create role toto login;
    CREATE ROLE
    =# select * from pg_stat_ssl;
      pid  | ssl | version |           cipher            | bits | compression | clientdn
    -------+-----+---------+-----------------------------+------+-------------+----------
     33348 | t   | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 |  256 | t           |
    (1 row)

2) New session 2 with previously created user:

    => select * from pg_stat_ssl;
      pid  | ssl | version |           cipher            | bits | compression | clientdn
    -------+-----+---------+-----------------------------+------+-------------+----------
     33348 | t   | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 |  256 | t           |
     33367 | t   | TLSv1.2 | ECDHE-RSA-AES256-GCM-SHA384 |  256 | t           |
    (2 rows)

Attached is a patch to mask those values to users that should not have access to it, similarly to the other fields of pg_stat_activity. I don't have the thread around right now (on phone), but didn't we discuss this back around the original submission and decide that this was wanted behavior? Looking back at this thread, it is mentioned here: http://www.postgresql.org/message-id/31891.1405175...@sss.pgh.pa.us What actual sensitive data is leaked? If knowing the cipher type makes it easier to hack you have a broken cipher, don't you? I am just wondering if it is a good idea to let other users know the origin of a connection to all the users. Let's imagine the case where for example the same user name is used for non-SSL and SSL sessions. This could give a hint of the activity on the server.. However, feel free to ignore those concerns if you think the current situation is fine... -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add pg_audit, an auditing extension
David Steele da...@pgmasters.net writes: My honest, albeit novice, opinion is that it was a mistake to pull pg_audit from contrib. I know more than anyone that it had flaws, mostly owing to its implementation as an extension, but it also provided capability that simply does not exist right now. Recent conversations about PGXN demonstrate why that is not (currently) a good alternative for distributing extensions. That means pg_audit will have a more limited audience than it could have had. That's a shame, because people are interested in pg_audit, warts and all. FWIW, I did not think that the consensus was that pg_audit could never be in contrib. As I understood it, people felt that (1) the API might not be sufficiently stable yet, and (2) there might be parts that should be in core eventually. (2) is problematic only because we do not have very good tools for migrating things from contrib to core without creating user-visible compatibility issues; which is not pg_audit's fault but it's still a constraint we have to keep in mind. So I'd encourage you to keep working on it and trying to address the issues that were brought up. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] The purpose of the core team
There has been some confusion by old and new community members about the purpose of the core team, and this lack of understanding has caused some avoidable problems. Therefore, the core team has written a core charter and published it on our website: http://www.postgresql.org/developer/core/ Hopefully this will be helpful to people. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for 9.4.4 et al
On 06/09/2015 04:38 PM, Michael Paquier wrote: On Wed, Jun 10, 2015 at 8:31 AM, Josh Berkus j...@agliodbs.com wrote: Tom, all: First draft of the release announcement. Please improve/edit/correct. Thanks! Some comments: s/expecially/especially. Thanks, fixed This bug fix is not mentioned (worth it?): Avoid deadlock between incoming sessions and CREATE/DROP DATABASE Huh? * Avoid deadlock between new sessions and CREATE/DROP DATABASE If not mentioned the fourth * is not needed in the list. Thanks, removed. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Checkpoints vs restartpoints
Hi Why do standby servers not simply treat every checkpoint as a restartpoint? As I understand it, setting checkpoint_timeout and checkpoint_segments higher on a standby server effectively instruct standby servers to skip some checkpoints. Even with the same settings on both servers, the server could still choose to skip a checkpoint near the checkpoint_timeout limit due to the vagaries of time keeping (though I suppose it's very unlikely). But what could the advantage of skipping checkpoints be? Do people deliberately set hot standby machines up like this to trade a longer crash recover time for lower write IO? I was wondering about this in the context of the recent multixact work, since such configurations could leave you with different SLRU files on disk which in some versions might change the behaviour in interesting ways. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for 9.4.4 et al
Tom, all: First draft of the release announcement. Please improve/edit/correct. Thanks! -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com

2015-06-12 Update Release
=========================

The PostgreSQL Global Development Group has released an update to all supported versions of our database system, including 9.4.4, 9.3.9, 9.2.13, 9.1.18 and 9.0.22. This release primarily fixes issues not successfully fixed in prior releases. It should be applied as soon as possible by any user who installed the May or June update releases. Other users should apply at the next available downtime.

Crash Recovery Fixes
--------------------

Earlier update releases attempted to fix an issue in PostgreSQL 9.3 and 9.4 with multixact wraparound, but failed to account for issues doing multixact cleanup during crash recovery. This could cause servers to be unable to restart after a crash. As such, all users of 9.3 and 9.4 should apply this update as soon as possible, expecially if they have already applied updates 9.3.7, 9.3.8, 9.4.2 or 9.4.3.

Database administrators who used pg_upgrade to upgrade to PostgreSQL version 9.3 may find that applying the update causes an immediate autovacuum of their entire database. Please see the [release notes](http://www.postgresql.org/docs/9.4/static/release-9-3-9.html) for details and ways to change the timing of the vacuum.

Other Fixes and Improvements
----------------------------

In addition to the above, a few other minor issues were patched in this release. These fixes include:

* Prevent failure to invalidate relation cache init file
* Avoid deadlock between new sessions and CREATE/DROP DATABASE
* Improve query planning for semi-joins and anti-joins
*

Cumulative Releases
-------------------

All PostgreSQL update releases are cumulative. As this update release fixes a number of problems inadvertently introduced by fixes in earlier update releases, we strongly urge users to apply this update, rather than installing less recent updates that have known issues. 
As this update release closes all known bugs with multixact handling, the PostgreSQL Project does not anticipate additional update releases soon.

Updating
--------

As with other minor releases, users are not required to dump and reload their database or use pg_upgrade in order to apply this update release; you may simply shut down PostgreSQL and update its binaries. Users who have skipped multiple update releases may need to perform additional post-update steps; see the Release Notes for details. See also the above note for users who used pg_upgrade with PostgreSQL version 9.3.

Links:

* [Download](http://www.postgresql.org/download)
* [Release Notes](http://www.postgresql.org/docs/current/static/release.html)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for 9.4.4 et al
On Wed, Jun 10, 2015 at 8:41 AM, Josh Berkus j...@agliodbs.com wrote: On 06/09/2015 04:38 PM, Michael Paquier wrote: On Wed, Jun 10, 2015 at 8:31 AM, Josh Berkus j...@agliodbs.com wrote: Tom, all: First draft of the release announcement. Please improve/edit/correct. Thanks! Some comments: s/expecially/especially. Thanks, fixed This bug fix is not mentioned (worth it?): Avoid deadlock between incoming sessions and CREATE/DROP DATABASE Huh? Sorry. I'll blame the lack of sleep.. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for 9.4.4 et al
On Wed, Jun 10, 2015 at 8:31 AM, Josh Berkus j...@agliodbs.com wrote: Tom, all: First draft of the release announcement. Please improve/edit/correct. Thanks! Some comments: s/expecially/especially. This bug fix is not mentioned (worth it?): Avoid deadlock between incoming sessions and CREATE/DROP DATABASE If not mentioned the fourth * is not needed in the list. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checkpoints vs restartpoints
On Tue, Jun 9, 2015 at 4:20 PM, Thomas Munro thomas.mu...@enterprisedb.com wrote: Hi Why do standby servers not simply treat every checkpoint as a restartpoint? As I understand it, setting checkpoint_timeout and checkpoint_segments higher on a standby server effectively instruct standby servers to skip some checkpoints. Even with the same settings on both servers, the server could still choose to skip a checkpoint near the checkpoint_timeout limit due to the vagaries of time keeping (though I suppose it's very unlikely). But what could the advantage of skipping checkpoints be? Do people deliberately set hot standby machines up like this to trade a longer crash recover time for lower write IO? When a hot standby server is initially being set up using a rather old base backup and an archive directory, it could be applying WAL at a very high rate such that it would replay master checkpoints multiple times a second (when the master has long periods with little write activity and has checkpoints driven by timeouts during those periods). Actually doing restartpoints that often could be annoying. Presumably there would be few dirty buffers to write out, since each checkpoint saw little activity, but you would still have to circle the shared_buffers twice, and fsync whichever files did happen to get some changes. Cheers, Jeff
Re: [HACKERS] Why no jsonb_exists_path()?
On 06/09/2015 05:30 PM, Andrew Dunstan wrote: On 06/09/2015 02:40 PM, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: I'm noticing a feature gap for JSONB operators; we have no way to do this: jsonb_col ? ARRAY['key1','key2','key3'] ... that is, there is no way for us to check for key existence in an indexable fashion. Given that @> already can check the whole path including the value, is there some challenge to stopping just short of the value I'm not seeing? Or is this just a "didn't get to it yet" issue? Hm, well, the jsonb_path_ops opclass couldn't do it, because what it indexes is hashes that include the value. I suppose jsonb_ops could look for entries that match all of the keys and then see if the ordering is correct. It looks to me like we'd need to index all paths in a document, or possibly hashes of all paths in a document. I don't think anything we have now will help much, unless my understanding is way off. On second thought, yes, we could possibly check the index against the path elements and then recheck. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why no jsonb_exists_path()?
On 06/09/2015 02:40 PM, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: I'm noticing a feature gap for JSONB operators; we have no way to do this: jsonb_col ? ARRAY['key1','key2','key3'] ... that is, there is no way for us to check for key existence in an indexable fashion. Given that @> already can check the whole path including the value, is there some challenge to stopping just short of the value I'm not seeing? Or is this just a "didn't get to it yet" issue? Hm, well, the jsonb_path_ops opclass couldn't do it, because what it indexes is hashes that include the value. I suppose jsonb_ops could look for entries that match all of the keys and then see if the ordering is correct. It looks to me like we'd need to index all paths in a document, or possibly hashes of all paths in a document. I don't think anything we have now will help much, unless my understanding is way off. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_archivecleanup bug (invalid filename input)
Hello, Trying to use pg_archivecleanup as a "standalone archive cleaner" results in an error of: pg_archivecleanup: invalid filename input Try "pg_archivecleanup --help" for more information. /usr/pgsql-9.4/bin/pg_archivecleanup /backups/db1/archive 0001074800B1.00015838.backup Works. /usr/pgsql-9.4/bin/pg_archivecleanup /backups/db1/archive ./0001074800B1.00015838.backup Or /usr/pgsql-9.4/bin/pg_archivecleanup /backups/db1/archive /backups/db1/archive/0001074800B1.00015838.backup Do not work, and present the error mentioned above. CentOS 6.6 with PGDG rpms. Sincerely, JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing I'm offended is basically telling the world you can't control your own emotions, so everyone else should do it for you.
Re: [HACKERS] psql :: support for \ev viewname and \sv viewname
Hi Patch looks excellent now. No issues. Found a typo which I have fixed in the attached patch. Thanks -- Jeevan B Chalke Principal Software Engineer, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 62a3b21..4467b8c 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -1651,6 +1651,34 @@ Tue Oct 26 21:40:57 CEST 1999 varlistentry +termliteral\ev optional replaceable class=parameterviewname/ optional replaceable class=parameterline_number/ /optional /optional /literal/term + +listitem +para + This command fetches and edits the definition of the named view, + in the form of a commandCREATE OR REPLACE VIEW/ command. + Editing is done in the same way as for literal\edit/. + After the editor exits, the updated command waits in the query buffer; + type semicolon or literal\g/ to send it, or literal\r/ + to cancel. +/para + +para + If no view is specified, a blank commandCREATE VIEW/ + template is presented for editing. +/para + +para + If a line number is specified, applicationpsql/application will + position the cursor on the specified line of the view definition. + (Note that the view definition typically does not begin on the first + line of the file.) +/para +/listitem + /varlistentry + + + varlistentry termliteral\encoding [ replaceable class=parameterencoding/replaceable ]/literal/term listitem @@ -2522,6 +2550,26 @@ testdb=gt; userinput\setenv LESS -imx4F/userinput varlistentry +termliteral\sv[+] replaceable class=parameterviewname/ /literal/term + +listitem + para + This command fetches and shows the definition of the named view, + in the form of a commandCREATE OR REPLACE VIEW/ command. + The definition is printed to the current query output channel, + as set by command\o/command. 
+ /para + + para + If literal+/literal is appended to the command name, then the + output lines are numbered, with the first line of the view definition + being line 1. + /para +/listitem + /varlistentry + + + varlistentry termliteral\t/literal/term listitem para diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 38253fa..71a2dfd 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -49,6 +49,16 @@ #include settings.h #include variables.h +/* + * Editable database object types. + * Currently functions and views are supported. + */ +typedef enum PgObjType +{ + PgObjTypeFunction, + PgObjTypeView + /* add new editable object type here */ +} PgObjType; /* functions for use in this file */ static backslashResult exec_command(const char *cmd, @@ -59,10 +69,17 @@ static bool do_edit(const char *filename_arg, PQExpBuffer query_buf, static bool do_connect(char *dbname, char *user, char *host, char *port); static bool do_shell(const char *command); static bool do_watch(PQExpBuffer query_buf, long sleep); -static bool lookup_function_oid(const char *desc, Oid *foid); -static bool get_create_function_cmd(Oid oid, PQExpBuffer buf); -static int strip_lineno_from_funcdesc(char *func); +static bool lookup_object_oid(const char *desc, + PgObjType obj_type, + Oid *obj_oid); +static bool get_create_object_cmd(Oid oid, PQExpBuffer buf, PgObjType type); +static void format_create_view_cmd(char *view, PQExpBuffer buf); +static int strip_lineno_from_objdesc(char *obj); static void minimal_error_message(PGresult *res); +static int count_lines_in_buf(PQExpBuffer buf); +static void print_with_linenumbers(FILE *output, + char *lines, + const char *header_cmp_keyword); static void printSSLInfo(void); static bool printPsetInfo(const char *param, struct printQueryOpt *popt); @@ -612,7 +629,7 @@ exec_command(const char *cmd, func = psql_scan_slash_option(scan_state, OT_WHOLE_LINE, NULL, true); - lineno = strip_lineno_from_funcdesc(func); + lineno = 
strip_lineno_from_objdesc(func); if (lineno == 0) { /* error already reported */ @@ -629,12 +646,12 @@ exec_command(const char *cmd, AS $function$\n
Re: [HACKERS] psql :: support for \ev viewname and \sv viewname
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed Patch looks good to pass to committer. The new status of this patch is: Ready for Committer
Re: [HACKERS] Checkpoints vs restartpoints
On Tue, Jun 9, 2015 at 05:20:23PM -0700, Jeff Janes wrote: On Tue, Jun 9, 2015 at 4:20 PM, Thomas Munro thomas.mu...@enterprisedb.com wrote: Hi Why do standby servers not simply treat every checkpoint as a restartpoint? As I understand it, setting checkpoint_timeout and checkpoint_segments higher on a standby server effectively instructs standby servers to skip some checkpoints. Even with the same settings on both servers, the server could still choose to skip a checkpoint near the checkpoint_timeout limit due to the vagaries of time keeping (though I suppose it's very unlikely). But what could the advantage of skipping checkpoints be? Do people deliberately set hot standby machines up like this to trade a longer crash recovery time for lower write IO? When a hot standby server is initially being set up using a rather old base backup and an archive directory, it could be applying WAL at a very high rate such that it would replay master checkpoints multiple times a second (when the master has long periods with little write activity and has checkpoints driven by timeouts during those periods). Actually doing restartpoints that often could be annoying. Presumably there would be few dirty buffers to write out, since each checkpoint saw little activity, but you would still have to circle the shared_buffers twice, and fsync whichever files did happen to get some changes. Ah, so even though standbys don't have to write WAL, they are fsyncing shared buffers. Where is the restart point recorded, in pg_controldata? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] pg_archivecleanup bug (invalid filename input)
On Wed, Jun 10, 2015 at 7:27 AM, Joshua D. Drake j...@commandprompt.com wrote: Trying to use pg_archivecleanup as a "standalone archive cleaner" results in an error of: pg_archivecleanup: invalid filename input Try "pg_archivecleanup --help" for more information. /usr/pgsql-9.4/bin/pg_archivecleanup /backups/db1/archive 0001074800B1.00015838.backup Works. Yes. /usr/pgsql-9.4/bin/pg_archivecleanup /backups/db1/archive ./0001074800B1.00015838.backup Or /usr/pgsql-9.4/bin/pg_archivecleanup /backups/db1/archive /backups/db1/archive/0001074800B1.00015838.backup Do not work, and present the error mentioned above. Looking at the documentation what is expected is not a path to a segment file, but only a segment file name: http://www.postgresql.org/docs/devel/static/pgarchivecleanup.html So the current behavior is correct, it is actually what SetWALFileNameForCleanup@pg_archivecleanup.c checks in the code. Speaking of which, 179cdd09 has forgotten pg_archivecleanup.c, making XLOG_DATA_FNAME_LEN not needed. XLOG_BACKUP_FNAME_LEN could be removed as well and IsBackupHistoryFileName() should be made smarter by checking that the length of the backup file name is actually 40... Regards, -- Michael
Re: [HACKERS] Checkpoints vs restartpoints
On Wed, Jun 10, 2015 at 9:33 AM, Bruce Momjian wrote: Ah, so even though standbys don't have to write WAL, they are fsyncing shared buffers. Where is the restart point recorded, in pg_controldata? Yep. Latest checkpoint's REDO location, or ControlFile->checkPointCopy.redo. During recovery, a copy is kept as well in XLogCtlData.lastCheckPoint. -- Michael
Re: [HACKERS] [patch] PL/Python is too lossy with floats
On 6/1/15 5:34 AM, Marko Kreen wrote: On Wed, Mar 11, 2015 at 9:49 PM, Peter Eisentraut pete...@gmx.net wrote: On 3/3/15 9:32 AM, Marko Kreen wrote: PL/Python uses str(v) to convert float data, which is lossy by design. Only repr(v) is guaranteed to have enough precision to make floats roundtrip properly: committed In 9.3 and before, numeric is converted to float in PL/Python. Please reconsider backporting. It's been like this forever, so I don't think it's appropriate to backpatch this.
Re: [HACKERS] Aggregate Supporting Functions
On 10 June 2015 at 01:53, Kevin Grittner kgri...@ymail.com wrote: David Rowley david.row...@2ndquadrant.com wrote: 3. Add logic in the planner to look for supporting cases. With logic something along the lines of: a. Does the query have any aggregates? If not - return; b. Does the query have more than 1 aggregate? If not - return; c. Does at least one of the aggregates have hassuppagg set to true? If not - return; d. Analyze aggregates to eliminate aggregates that are covered by another aggregate. We should use the aggregate which eliminates the most other aggregates* * For example stddev(x) will support avg(x), sum(x) and count(x) so a query such as select stddev(x), avg(x), sum(x), count(x) will eliminate avg(x), sum(x), count(x) as stddev(x) supports 3, avg(x) only supports 2, so will have to be eliminated. Are you assuming that x must match exactly among the aggregates to allow coverage? In my haste I neglected to mention that critical part :) Yeah the expression within the aggregation function must match exactly. This would mean that count(*) would not optimise but count(x) would. I believe that's an ok restriction on a first cut implementation, as that rewrite requires some more NULLability analysis. Have you given any thought to whether ties are possible and how they should be resolved? I guess ties are possible, although I can't immediately think of which of the standard aggs could cause that. I've not thought on it a great deal to be honest. I'm not too sure if pg_proc.procost is better to check or if pg_aggregate.aggtransspace would be better. I see procost is not very well set for quite a lot of functions, e.g. float4pl and numeric_sum currently both have the cost of 1, which is certainly not the case. So perhaps it would be easier to just use aggtransspace, and if there's still a tie then just prefer the one that comes first in the targetlist.
Likely that would be good as it keeps it predictable and allows the users to have influence on the decision. I'm a little bit concerned that someone will one day report that: SELECT avg(x), sum(x), count(x) from bigtable; Is faster than: SELECT sum(x), count(x) from bigtable; Of course, this will be just because we've made case 1 faster, NOT because we've slowed down case 2. Yeah, that seems possible (if not with these functions, at least technically possible with *some* functions), and hard to explain to a novice. I can't immediately think of a way to fix that without risking slowing down: select count(x) from bigtable; From the information you have proposed storing, with cost factors associated with the functions, it seems technically possible to infer that you could run (for example) the avg() aggregate to accumulate both but only run the final functions of the aggregates referenced by the query. That seems like an optimization to try hard to forget about until you have at least one real-world use case where it would yield a significant benefit. It seems premature to optimize for that before having the rest working. I see your point, and in my example I think your idea works well, but there's a risk of slowing things down here too for other cases. With the supporting aggregates we're just listing the aggregates we happen to calculate as a byproduct. Using their value is about as cheap as calling the final function, but with this, if we decided to use avg(x) instead of sum(x) and count(x) we really have no way to understand what else avg(x) might be doing. For example, if we pretend avg() does not exist, and we have stddev(x), sum(x) and count(x) as aggregates, given the query select sum(x), count(x) from t... stddev(x) gives us count(x) and sum(x) as a byproduct, but stddev also calculates sum(x*x), which quite likely ends up slower than just doing sum(x) and count(x) like normal. 
Perhaps the code that looks for these patterns could take the aggregate that supports the smallest number of supporting aggregates that's enough to cover the requirements, but still... We could perhaps add some smarts that analyse the costs of the transition function calls here to ensure that stuff won't actually happen. Maybe that's phase 3 though. I don't think I want to touch it at this stage, but for sure something to remember for the future! -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] The Future of Aggregation
On 10 June 2015 at 02:52, Kevin Grittner kgri...@ymail.com wrote: David Rowley david.row...@2ndquadrant.com wrote: The idea I discussed in the link in item 5 above gets around this problem, but it's a perhaps more surprise filled implementation as it will mean select avg(x),sum(x),count(x) from t is actually faster than select sum(x),count(x) from t as the agg state for avg() will satisfy sum and count too. I'm skeptical that it will be noticeably faster. It's easy to see why this optimization will make a query *with all three* faster, but I would expect the process of accumulating the sum and count to be about the same speed whether performed by one transition function or two. Of course I could be persuaded by a benchmark showing otherwise. Thanks for looking at this. Assuming that if we reuse the avg(x) state for count(x) and sum(x) then it will perform almost exactly like a query containing just avg(x), the only additional overhead is the call to the final functions per group, so in the following case that's likely immeasurable: /* setup */ create table millionrowtable as select generate_series(1,1000000)::numeric as x; /* test 1 */ SELECT sum(x) / count(x) from millionrowtable; /* test 2 */ SELECT avg(x) from millionrowtable; Test 1: 274.979 ms 272.104 ms 269.915 ms Test 2: 229.619 ms 220.703 ms 234.743 ms (Test 1 is about 19% slower) The good news is that it's not slower than before, so should be acceptable, though hard to explain to people. Regards David Rowley
Re: [HACKERS] Run pgindent now?
On Tue, Jun 9, 2015 at 09:56:13PM -0400, Robert Haas wrote: What I really don't want to do is apply the pgindent diff somewhat blindly, without really knowing how many cases we're improving and how many cases we're making worse. The number of times we've run pgindent and then realized later that it messed a bunch of stuff up is actually pretty high, and I find doing that to the back-branches particularly unexciting. It is doing +1M lines of C code --- I assume it always will mess something up. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Run pgindent now?
On Tue, May 26, 2015 at 2:55 PM, Robert Haas robertmh...@gmail.com wrote: This is kind of why I think that reindenting the back branches is unlikely to be productive: it only helps if you can get pgindent to do the same thing on all branches, and I bet that's going to be tough. ...but having said that and thought about this a bit more, we could actually test that rather than guessing. Suppose somebody goes and writes a script which diffs the version of each file on master with the version of the same file, if it exists, on each stable branch. And then they indent each back-branch on a private branch and do the same thing again. And then they produce a report of every file where the number of lines that differ went up rather than down. Then, we could look at the files where things got worse and try to fix whatever the issue is. Of course, it would be nice to be even more fine-grained and try to figure out whether there are files where, although the file overall ended up with fewer differences, some individual places in the file diverged. Maybe somebody with sufficient git-fu could figure out how to do that. But even without that, it seems to me that with some work, we could probably measure how good an outcome we're actually going to get here. What I really don't want to do is apply the pgindent diff somewhat blindly, without really knowing how many cases we're improving and how many cases we're making worse. The number of times we've run pgindent and then realized later that it messed a bunch of stuff up is actually pretty high, and I find doing that to the back-branches particularly unexciting. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [patch] PL/Python is too lossy with floats
Peter Eisentraut pete...@gmx.net writes: On 6/1/15 5:34 AM, Marko Kreen wrote: Please reconsider backporting. It's been like this forever, so I don't think it's appropriate to backpatch this. http://www.postgresql.org/docs/devel/static/plpython-data.html states in so many words that floats are converted with str(). Presumably this needs an update in HEAD, but I would say that the current behavior is per specification in all back branches. regards, tom lane
Re: [HACKERS] The Future of Aggregation
On 10 June 2015 at 03:25, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Kevin Grittner wrote: Alvaro Herrera alvhe...@2ndquadrant.com wrote: Uh, this also requires serialization and deserialization of non-finalized transition state, no? For that sort of optimization to incremental maintenance of materialized views (when we get there), yes. That will be one of many issues to sort out. Any reason you're focusing on that now? Do you think we need to settle on a format for that to proceed with the work David is discussing? No, it's just that it wasn't on David's list. That's this part, right? I wrote: which I believe will need to be modified to implement complex database types to backup our internal aggregate state types so that these types can be properly passed between executor nodes, between worker processes and perhaps foreign data wrappers (maybe just postgres_fdw I've not looked into this yet) -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Aggregate Supporting Functions
David Rowley david.row...@2ndquadrant.com wrote: The idea is that we skip a major chunk of processing in situations like: SELECT avg(x),sum(x),count(x) FROM bigtable; Because avg(x) already technically knows what the values of sum(x) and count(x) are. That has occurred to me as a possible optimization, but I hadn't gone so far as to try to determine what the possible performance improvement would be. The performance improvement of this particular case is as follows: create table bigtable as select generate_series(1,1000000)::numeric as x; vacuum bigtable; SELECT avg(x),sum(x),count(x) FROM bigtable; -- Query 1 Time: 390.325 ms Time: 392.297 ms Time: 400.790 ms SELECT avg(x) FROM bigtable; -- Query 2 Time: 219.700 ms Time: 215.285 ms Time: 233.691 ms With the implementation I'm proposing below, I believe that query 1 should perform almost as well as query 2. The only extra CPU work that would be done would be some extra checks during planning, and the calling of 2 simple new final functions which will extract the count(x) and sum(x) from the avg transition's state. I agree that if the increase in planning time is negligible, the performance to be gained in this example is a reduction of about 45% from the original run time. That certainly seems to make it worth considering. Implementation: 1. Add a new boolean column in pg_aggregate named hassuppagg which will be set to true if the aggregate supports other aggregates. For example avg(int) will support count(int) and sum(int) 2. Add new system table named pg_aggregate_support (Or some better shorter name) This system table will be defined as follows: aspfnoid regproc, aspfnsupported regproc, aspfinalfn regproc, primary key (aspfnoid, aspfnsupported) Where in the above example aspfnoid will be avg(int) and 2 rows will exist. 1 with count(int) in aspfnsupported, and one with sum(int) in the aspfnsupported column. aspfinalfn will be a new final function which extracts the required portion of the avg's aggregate state.
The caffeine hasn't had a chance to fully kick in yet, but that seems like enough information to optimize out the count() and sum() aggregations. 3. Add logic in the planner to look for supporting cases. With logic something along the lines of: a. Does the query have any aggregates? If not - return; b. Does the query have more than 1 aggregate? If not - return; c. Does at least one of the aggregates have hassuppagg set to true? If not - return; d. Analyze aggregates to eliminate aggregates that are covered by another aggregate. We should use the aggregate which eliminates the most other aggregates* * For example stddev(x) will support avg(x), sum(x) and count(x) so a query such as select stddev(x), avg(x), sum(x), count(x) will eliminate avg(x), sum(x), count(x) as stddev(x) supports 3, avg(x) only supports 2, so will have to be eliminated. Are you assuming that x must match exactly among the aggregates to allow coverage? Have you given any thought to whether ties are possible and how they should be resolved? I'm a little bit concerned that someone will one day report that: SELECT avg(x), sum(x), count(x) from bigtable; Is faster than: SELECT sum(x), count(x) from bigtable; Of course, this will be just because we've made case 1 faster, NOT because we've slowed down case 2. Yeah, that seems possible (if not with these functions, at least technically possible with *some* functions), and hard to explain to a novice. I can't immediately think of a way to fix that without risking slowing down: select count(x) from bigtable; From the information you have proposed storing, with cost factors associated with the functions, it seems technically possible to infer that you could run (for example) the avg() aggregate to accumulate both but only run the final functions of the aggregates referenced by the query. That seems like an optimization to try hard to forget about until you have at least one real-world use case where it would yield a significant benefit.
It seems premature to optimize for that before having the rest working. To allow users to implement aggregates which take advantage of this we'd better also expand the CREATE AGGREGATE syntax. I've not given this a huge amount of thought. The only thing I've come up with so far is: CREATE AGGREGATE avg(bigint) (FINALFUNC = avgfinal) SUPPORTS (count(bigint) = int8_avg_countfn, sum(bigint) = int8_avg_sumfn); I would try to find an existing keyword which could be used instead of adding SUPPORTS to the list. On a quick look over the keyword list EXTRACT jumped out at me as a candidate. There should be something usable in there somewhere. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Expending the use of xlog_internal.h's macros
Hi all, While looking at the code of pg_archivecleanup.c, I noticed that there is some code present to detect if a given string has the format of a WAL segment file name or of a backup file. The recent commit 179cdd09 has introduced in xlog_internal.h a set of macros to facilitate checks of pg_xlog's name format: IsPartialXLogFileName(), IsXLogFileName() and IsTLHistoryFileName(). And by looking at the code, there are some utilities where we could make use of that, like pg_resetxlog, pg_archivecleanup and pg_standby. Attached is a small refactoring patch doing so for HEAD. Regards, -- Michael diff --git a/contrib/pg_standby/pg_standby.c b/contrib/pg_standby/pg_standby.c index 2f9f2b4..4b78209 100644 --- a/contrib/pg_standby/pg_standby.c +++ b/contrib/pg_standby/pg_standby.c @@ -32,6 +32,8 @@ #include "pg_getopt.h" +#include "access/xlog_internal.h" + const char *progname; /* Options and defaults */ @@ -113,9 +115,8 @@ struct stat stat_buf; * folded in to later versions of this program. */ -#define XLOG_DATA_FNAME_LEN 24 /* Reworked from access/xlog_internal.h */ -#define XLogFileName(fname, tli, log, seg) \ +#define XLogFileNameExtended(fname, tli, log, seg) \ snprintf(fname, XLOG_DATA_FNAME_LEN + 1, "%08X%08X%08X", tli, log, seg) /* @@ -182,10 +183,7 @@ CustomizableNextWALFileReady() * If it's a backup file, return immediately. If it's a regular file * return only if it's the right size already. */ - if (strlen(nextWALFileName) > 24 && - strspn(nextWALFileName, "0123456789ABCDEF") == 24 && - strcmp(nextWALFileName + strlen(nextWALFileName) - strlen(".backup"), - ".backup") == 0) + if (IsBackupHistoryFileName(nextWALFileName)) { nextWALFileType = XLOG_BACKUP_LABEL; return true; @@ -261,8 +259,7 @@ CustomizableCleanupPriorWALFiles(void) * are not removed in the order they were originally written, * in case this worries you.
*/ -if (strlen(xlde->d_name) == XLOG_DATA_FNAME_LEN && - strspn(xlde->d_name, "0123456789ABCDEF") == XLOG_DATA_FNAME_LEN && +if (IsXLogFileName(xlde->d_name) && strcmp(xlde->d_name + 8, exclusiveCleanupFileName + 8) < 0) { #ifdef WIN32 @@ -366,7 +363,7 @@ SetWALFileNameForCleanup(void) } } - XLogFileName(exclusiveCleanupFileName, tli, log, seg); + XLogFileNameExtended(exclusiveCleanupFileName, tli, log, seg); return cleanup; } @@ -740,10 +737,7 @@ main(int argc, char **argv) * Check for initial history file: always the first file to be requested * It's OK if the file isn't there - all other files need to wait */ - if (strlen(nextWALFileName) > 8 && - strspn(nextWALFileName, "0123456789ABCDEF") == 8 && - strcmp(nextWALFileName + strlen(nextWALFileName) - strlen(".history"), - ".history") == 0) + if (IsTLHistoryFileName(nextWALFileName)) { nextWALFileType = XLOG_HISTORY; if (RestoreWALFileForRecovery()) diff --git a/src/bin/pg_archivecleanup/pg_archivecleanup.c b/src/bin/pg_archivecleanup/pg_archivecleanup.c index ba6e242..ded36cc 100644 --- a/src/bin/pg_archivecleanup/pg_archivecleanup.c +++ b/src/bin/pg_archivecleanup/pg_archivecleanup.c @@ -21,6 +21,8 @@ #include "pg_getopt.h" +#include "access/xlog_internal.h" + const char *progname; /* Options and defaults */ @@ -51,11 +53,9 @@ char exclusiveCleanupFileName[MAXPGPATH]; /* the oldest file we * folded in to later versions of this program. */ -#define XLOG_DATA_FNAME_LEN 24 /* Reworked from access/xlog_internal.h */ -#define XLogFileName(fname, tli, log, seg) \ +#define XLogFileNameExtended(fname, tli, log, seg) \ snprintf(fname, XLOG_DATA_FNAME_LEN + 1, "%08X%08X%08X", tli, log, seg) -#define XLOG_BACKUP_FNAME_LEN 40 /* * Initialize allows customized commands into the archive cleanup program. */ @@ -129,8 +129,7 @@ CleanupPriorWALFiles(void) * file. Note that this means files are not removed in the order * they were originally written, in case this worries you.
*/ - if (strlen(walfile) == XLOG_DATA_FNAME_LEN && - strspn(walfile, "0123456789ABCDEF") == XLOG_DATA_FNAME_LEN + if (IsXLogFileName(walfile) && strcmp(walfile + 8, exclusiveCleanupFileName + 8) < 0) { /* @@ -202,13 +201,12 @@ SetWALFileNameForCleanup(void) * 00010010.0020.backup is after * 00010010. */ - if (strlen(restartWALFileName) == XLOG_DATA_FNAME_LEN && - strspn(restartWALFileName, "0123456789ABCDEF") == XLOG_DATA_FNAME_LEN) + if (IsXLogFileName(restartWALFileName)) { strcpy(exclusiveCleanupFileName, restartWALFileName); fnameOK = true; } - else if (strlen(restartWALFileName) == XLOG_BACKUP_FNAME_LEN) + else if (IsBackupHistoryFileName(restartWALFileName)) { int args; uint32 tli = 1, @@ -225,7 +223,7 @@ SetWALFileNameForCleanup(void) * Use just the prefix of the filename, ignore everything after * first period */ - XLogFileName(exclusiveCleanupFileName, tli, log, seg); + XLogFileNameExtended(exclusiveCleanupFileName, tli, log, seg); } } diff
Re: [HACKERS] Restore-reliability mode
On Wed, Jun 03, 2015 at 04:18:37PM +0200, Andres Freund wrote: On 2015-06-03 09:50:49 -0400, Noah Misch wrote: Second, I would define the subject matter as "bug fixes, testing and review", not "restructuring, testing and review". Different code structures are clearest to different hackers. Restructuring, on average, adds bugs even more quickly than feature development adds them. I can't agree with this. While I agree with not doing large restructuring for 9.5, I think we can't afford not to refactor for clarity, even if that introduces bugs. Noticeable parts of our code have to frequently be modified for new features and are badly structured at the same time. While restructuring may temporarily increase the number of bugs in the short term, it'll decrease the number of bugs long term while increasing the number of potential contributors and new features. That's obviously not to say we should just refactor for the sake of it. I think I agree with everything after your first sentence. I liked your specific proposal to split StartupXLOG(), but making broad-appeal restructuring proposals is hard. I doubt we would get good results by casting a wide net for restructuring ideas. Automated testing has a lower barrier to entry and is far less liable to make things worse instead of better. I can hope for good results from a TestSuiteFest, but not from a RestructureFest. That said, if folks initiate compelling restructure proposals, we should be willing to risk bugs from them like we risk bugs to acquire new features.
Re: [HACKERS] The Future of Aggregation
David Rowley wrote: On 10 June 2015 at 03:25, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Kevin Grittner wrote: Alvaro Herrera alvhe...@2ndquadrant.com wrote: Uh, this also requires serialization and deserialization of non-finalized transition state, no? For that sort of optimization to incremental maintenance of materialized views (when we get there), yes. That will be one of many issues to sort out. Any reason you're focusing on that now? Do you think we need to settle on a format for that to proceed with the work David is discussing? No, it's just that it wasn't on David's list. That's this part, right? I wrote: which I believe will need to be modified to implement complex database types to backup our internal aggregate state types so that these types can be properly passed between executor nodes, between worker processes and perhaps foreign data wrappers (maybe just postgres_fdw I've not looked into this yet) Hmm, sort of, I guess. If I understand Kevin correctly, he says these would be stored on disk alongside matview tuples. Anyway, it's probably not the hardest issue to solve ... -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] The Future of Aggregation
On 6/9/15 9:52 AM, Kevin Grittner wrote: Yeah, I think we want to preserve the ability of count() to have a simple state, and implement dependent aggregates as discussed in the other thread -- where (as I understood it) having sum(x), count(x), and avg(x) in a query would avoid the row-by-row work for sum(x) and count(x), and just invoke a final function to extract those values from the transition state of the avg(x) aggregate. I see incremental maintenance of materialized views taking advantage of the same sort of behavior, only maintaining the state for avg(x) during incremental maintenance and *at the end* pulling the values for sum(x) and count(x) out of that. Last I checked, Oracle forbade things like avg() in matviews. Since it's trivial to calculate avg() by hand, I don't see that as a big deal. It'd be nice to not require that, but it'd be MUCH nicer to have any kind of incremental matview update. Just trying to keep things in perspective. :) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
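The dependent-aggregate idea described above can be sketched in a few lines. The state layout and final-function names here are illustrative, not PostgreSQL's internals: the point is that avg(x)'s (count, sum) transition state already contains everything sum(x) and count(x) need, so those two can be answered with a final-function call instead of their own per-row state updates:

```python
# Hypothetical final functions: sum(x) and count(x) share avg(x)'s
# transition state instead of each maintaining their own row-by-row state.
def avg_final(state):
    count, total = state
    return total / count

def sum_final(state):
    return state[1]

def count_final(state):
    return state[0]

# Row-by-row work happens once, for the shared (count, sum) state.
state = (0, 0)
for v in (5, 10, 15):
    state = (state[0] + 1, state[1] + v)

# One transition state, three results -- no extra per-row work.
print(count_final(state), sum_final(state), avg_final(state))  # 3 30 10.0
```

This is also why maintaining only avg(x)'s state during incremental matview maintenance is enough: sum and count fall out at the end.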
Re: [HACKERS] pg_archivecleanup bug (invalid filename input)
On 06/09/2015 05:54 PM, Michael Paquier wrote: Looking at the documentation, what is expected is not a path to a segment file, but only a segment file name: http://www.postgresql.org/docs/devel/static/pgarchivecleanup.html So the current behavior is correct; it is actually what SetWALFileNameForCleanup@pg_archivecleanup.c checks in the code. OK, so I would ask: should we adjust this behavior? It seems to me that it should accept a path. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing I'm offended is basically telling the world you can't control your own emotions, so everyone else should do it for you.
Re: [HACKERS] could not adopt C locale failure at startup on Windows
On Tue, Jun 09, 2015 at 12:24:02PM -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: The error seems odd. The only even remotely related change between 9.4.1 and .2 seems to be ca325941. Could also be a build environment change. Yeah, my first instinct was to blame ca325941 as well, but I don't think any of that code executes during init_locale(). Also, http://www.postgresql.org/message-id/20150326040321.2492.24...@wrigleys.postgresql.org says it's been seen in 9.4.1. The return value check and error message were new in 9.4.1. I suspect the underlying problem has been present far longer, undetected. I can reproduce this with initdb --locale=C, postgresql-9.4.3-1-windows-binaries.zip (32-bit), Windows 7 x64, and the Windows ANSI code page set to CP936. (Choose Chinese (Simplified, PRC) in Control Panel - Region and Language - Administrative - Language for non-Unicode programs.) It is neither necessary nor sufficient to change Control Panel - Region and Language - Formats - Format. Binaries from postgresql-9.4.3-1-windows-x64-binaries.zip do not exhibit the problem. Note that CP936 is a PG_ENCODING_IS_CLIENT_ONLY() encoding. What I plan to do is this:

static void
init_locale(const char *categoryname, int category, const char *locale)
{
	if (pg_perm_setlocale(category, locale) == NULL &&
		pg_perm_setlocale(category, "C") == NULL)
		elog(FATAL, "could not adopt either \"%s\" locale or C locale for %s",
			 locale, categoryname);
}

Good to have.
[HACKERS] Typo fix loged vs logged.
The attached fixes a small typo in a comment. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h index fea99c7..e526cd9 100644 --- a/src/include/catalog/pg_class.h +++ b/src/include/catalog/pg_class.h @@ -167,7 +167,7 @@ DESCR(""); #define REPLICA_IDENTITY_DEFAULT 'd' /* no replica identity is logged for this relation */ #define REPLICA_IDENTITY_NOTHING 'n' -/* all columns are loged as replica identity */ +/* all columns are logged as replica identity */ #define REPLICA_IDENTITY_FULL 'f' /* * an explicitly chosen candidate key's columns are used as identity; --
[HACKERS] [idea] table partition + hash join
Hello, It might be a corner-case optimization; however, it seems worth sharing the idea and having a discussion.

Table partition + hash join pushdown

Hash-join logic works most effectively when the inner relation can be stored within a hash table. So it is a meaningful optimization if we can filter out, in advance, inner tuples that will never be referenced in the join. Let's assume a table which is partitioned into four portions, where the individual child relations have a constraint on the hash value of the ID field:

tbl_parent
 + tbl_child_0 ... CHECK(hash_func(id) % 4 = 0)
 + tbl_child_1 ... CHECK(hash_func(id) % 4 = 1)
 + tbl_child_2 ... CHECK(hash_func(id) % 4 = 2)
 + tbl_child_3 ... CHECK(hash_func(id) % 4 = 3)

If someone joins another relation with tbl_parent using an equivalence condition, like X = tbl_parent.ID, we know that inner tuples not satisfying the condition hash_func(X) % 4 = 0 can never be joined to the tuples in tbl_child_0. So we can skip loading those tuples into the inner hash table in advance, which potentially allows us to split the inner hash table. The current typical plan structure is below:

HashJoin
 -> Append
   -> SeqScan on tbl_child_0
   -> SeqScan on tbl_child_1
   -> SeqScan on tbl_child_2
   -> SeqScan on tbl_child_3
 -> Hash
   -> SeqScan on other_table

It may be rewritable to:

Append
 -> HashJoin
   -> SeqScan on tbl_child_0
   -> Hash ... Filter: hash_func(X) % 4 = 0
     -> SeqScan on other_table
 -> HashJoin
   -> SeqScan on tbl_child_1
   -> Hash ... Filter: hash_func(X) % 4 = 1
     -> SeqScan on other_table
 -> HashJoin
   -> SeqScan on tbl_child_2
   -> Hash ... Filter: hash_func(X) % 4 = 2
     -> SeqScan on other_table
 -> HashJoin
   -> SeqScan on tbl_child_3
   -> Hash ... Filter: hash_func(X) % 4 = 3
     -> SeqScan on other_table

Good:
- Reduction of the inner hash table size; eventually, it may reduce the nBatches of the HashJoin.
Bad:
- The inner relation has to be scanned multiple times.
- Additional CPU cost to evaluate the relevant CHECK() constraint when Hash loads the inner relation.
So, unless the Hash plan expects the inner hash table to be split into batches, the plan above is never chosen because of its extra cost. However, it may make sense if work_mem is not large enough to load the whole inner relation at once. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com
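The rewrite proposed in this mail can be sketched to check that per-partition filtering of the inner relation preserves the join result. This is an illustrative model, not executor code: hash_func, the partition count, and the tuple layout are all assumptions standing in for the CHECK() constraints and the Hash node's filter:

```python
# Sketch of the proposed rewrite: instead of one big hash table over the
# inner relation, build one small hash table per partition, loading only
# the inner tuples whose join key hashes into that partition.

def hash_func(key):
    return hash(key)

N_PARTS = 4
outer = [(i, "outer%d" % i) for i in range(20)]        # tbl_parent children
inner = [(i, "inner%d" % i) for i in range(0, 20, 3)]  # other_table

results = []
for part in range(N_PARTS):
    # Inner tuples that cannot match this partition are filtered out
    # while loading the hash table, keeping each table small.
    table = {}
    for key, payload in inner:
        if hash_func(key) % N_PARTS == part:
            table.setdefault(key, []).append(payload)
    # Probe with the outer tuples belonging to this partition
    # (tbl_child_<part>, per its CHECK constraint).
    for key, payload in outer:
        if hash_func(key) % N_PARTS == part:
            for match in table.get(key, []):
                results.append((key, payload, match))

# Same join result as a single hash join over the whole inner relation.
full = [(ko, po, pi) for ko, po in outer for ki, pi in inner if ko == ki]
assert sorted(results) == sorted(full)
```

The trade-off from the mail is visible here: the inner relation is scanned once per partition, but each hash table holds roughly 1/N_PARTS of the inner tuples, which is exactly what could keep a batch within work_mem.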