Re: [HACKERS] [bug fix] pg_ctl fails with config-only directory
On Thu, Dec 5, 2013 at 6:30 PM, MauMau wrote:
> From: "Amit Kapila"
>> On Wed, Dec 4, 2013 at 7:57 PM, MauMau wrote:
>>
>> Approach-2 has been discussed previously to resolve it and it doesn't
>> seem to be a good way to handle it. Please refer to this link:
>> http://www.postgresql.org/message-id/1339601668-sup-4...@alvh.no-ip.org
>>
>> You can go through that mail chain and see if there can be a better
>> solution than Approach-2.
>
> Thanks for the info. I understand your feeling, but we need to be
> practical. I believe we should not leave a bug and an inconvenience by
> worrying too much about theory. In addition to the config-only directory,
> a DBA with admin privs will naturally want to run "postgres -C" and
> "postgres --describe-config", because they are useful and are described
> in the manual. I don't see any (at least big) risk in allowing postgres
> -C/--describe-config to run with admin privs.

Today, I again went through all the discussion that happened at that time related to this problem, and I found that later in the discussion something along the lines of your Approach-2 was considered; please see the link:
http://www.postgresql.org/message-id/503a879c.6070...@dunslane.net

> In addition, recent Windows versions help to secure the system by
> revoking admin privs with UAC, don't they? Disabling UAC is not
> recommended.
>
> I couldn't find a way to let postgres delete its token groups from its
> own primary access token. There doesn't seem to be a reasonably clean and
> good way.

Wouldn't the other way to resolve this problem be to re-invoke pg_ctl in non-restricted mode for the case in question?

> So I had to choose approach 2. Please find attached the patch. This
> simple and not-complex change worked well. I'd like to add this to the
> 2014-1 commitfest this weekend unless a better approach is proposed.

I think it is important to resolve this problem, so please go ahead and upload this patch to the next CF.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE sampling is too good
> http://en.wikipedia.org/wiki/Cluster_sampling
> http://en.wikipedia.org/wiki/Multistage_sampling
>
> I suspect the hard part will be characterising the nature of the
> non-uniformity in the sample generated by taking a whole block. Some of
> it may come from how the rows were loaded (e.g. older rows were loaded by
> pg_restore but newer rows were inserted retail) or from the way Postgres
> works (e.g. hotter rows are on blocks with fewer rows in them and colder
> rows are more densely packed).

I would have thought that as VACUUM reclaims space, it levels that issue out in the long run and on average, so that it could simply be ignored?

> I've felt for a long time that Postgres would make an excellent test bed
> for some aspiring statistics research group.

I would say "applied statistics" rather than "research". Nevertheless I can ask my research statistician colleagues next door for their opinion on this sampling question.

--
Fabien.
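[For readers wanting to experiment with the trade-off being discussed: the TABLESAMPLE clause that later shipped in PostgreSQL 9.5 exposes exactly these two sampling strategies. Table and column names below are hypothetical.]

```sql
-- SYSTEM does block-level (cluster) sampling: whole pages are chosen at
-- random and every row on a chosen page is returned together -- fast, but
-- it inherits any per-block non-uniformity of the kind described above.
SELECT avg(val) FROM big_table TABLESAMPLE SYSTEM (1);     -- ~1% of pages

-- BERNOULLI does row-level sampling: every page is visited and each row
-- is kept independently with the given probability -- uniform, but it
-- costs a full scan.
SELECT avg(val) FROM big_table TABLESAMPLE BERNOULLI (1);  -- ~1% of rows
```

Comparing the two estimates on a table whose rows are physically clustered by value makes the block-sampling bias directly visible.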
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 06/12/2013 22:59, Peter Eisentraut wrote:
> On 12/6/13, 12:29 PM, Álvaro Hernández Tortosa wrote:
>> What I've been trying to do is summarize what has already been discussed
>> here and propose a solution. You say that "you can already do those
>> things", but that's not what I have read here. Greg Smith (cc'ed as I'm
>> quoting you) was explaining this in [1]:
>>
>> "Right now, writing such a tool in a generic way gets so bogged down
>> just in parsing/manipulating the postgresql.conf file that it's hard to
>> focus on actually doing the tuning part."
>
> That was in 2008. I don't think that stance is accurate anymore.

Just for me to learn about this: why is it not accurate anymore?

Thanks for your patience! :)

aht
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 06/12/2013 19:11, David Johnston wrote:
> Álvaro Hernández Tortosa wrote
>
> Note that you are not required to maintain your configuration data in a
> postgresql.conf-formatted file. You can keep it anywhere you like, GUI
> around in it, and convert it back to the required format. Most of the

I think it is not a very good idea to encourage GUI tools, or tools to auto-configure postgres, to use a separate configuration file and then convert it to postgresql.conf. That introduces a duplication with evil problems if either source of data is modified out-of-the-expected-way. That's why I'm suggesting a config file that is, at the same time, usable by both postgres and other external tools. That also enables other features, such as editing the config file persistently through a SQL session.

> For my money I'd rather have a single file and/or directory-structure
> where raw configuration settings are saved in the current 'key = value'
> format with simple comments allowed and ignored by PostgreSQL. And being
> simple key-value the risk of "out-of-the-expected-way" changes would be
> minimal.

What I meant by "out-of-the-expected-way" is that if you edit postgresql.conf directly rather than through a tool (assuming you're regularly using the tool), then those changes may get lost when you use the tool again. In other words, there's potentially "duplicated information", and we all know that that's not desirable.

> If we want to put a separate "configuration meta-data" file out there to
> basically provide a database from which third-party tools can pull out
> this information then great. I would not incorporate that same
> information into the main PostgreSQL configuration
> file/directory-structure. The biggest advantage is that the meta-data
> database can be readily modified without any concern regarding such
> changes impacting running systems upon update.
> Then, tools simply need to import "two" files instead of one, link
> together the meta-data key with the configuration key, and do whatever
> they were going to do anyway.

Although I think it's not ideal to have two separate, both editable, files for configuring postgresql, if:

- both would be included in the official distribution, one alongside the other
- a tool for converting the new one into the current postgresql.conf were also included with the distribution, say bin/pgconfiguration or whatever

then I'd agree that it could be a great first step towards both adding support for external tooling for configuring postgres, and providing new users with a lot more help if they don't use any other tool. Of course, other tools could be completely external to the postgresql distribution, but not the "alternate" configuration file and the pgconfiguration program. Would this be a good thing to do then?

> If indeed that target audience is going to be novices then a static
> text-based document is not going to be the most desirable interface to
> present. At worst we should simply include a comment-link at the top of
> the document to a web-page where an interactive tool for configuration
> file creation would exist. That tool, at the end of the process, could
> provide the user with text to copy-paste/save into a specified area on
> the server so the customizations made would override the installed
> defaults.

I think both could be used a lot: editing a rich configuration file directly, or using a GUI tool. I'm trying to suggest supporting both.

Regards,

aht
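[For what it's worth, the "editing the config file persistently through a SQL session" part of this wish-list later materialized as ALTER SYSTEM in PostgreSQL 9.4, which sidesteps the duplicated-information problem by writing to a separate file, postgresql.auto.conf, that overrides postgresql.conf:]

```sql
-- Persist a setting from a SQL session; the server rewrites
-- postgresql.auto.conf and never touches the hand-edited postgresql.conf.
ALTER SYSTEM SET work_mem = '64MB';

-- Apply it without a restart (for settings that permit reload):
SELECT pg_reload_conf();
```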
Re: [HACKERS] Recovery to backup point
On Sat, Dec 7, 2013 at 9:06 AM, MauMau wrote:
> It seems that everyone welcomed the following functionality, and I also
> want it to solve some problems. But this doesn't appear to have been
> undertaken.

Indeed, nobody has really shown up to implement that.

> Recovery target 'immediate'
> http://www.postgresql.org/message-id/51703751.2020...@vmware.com
> Is there any technical difficulty?

As far as I recall, I don't think so. The problem and the way to solve it are clear. The only trick is to be sure that recovery is done just until a consistent point is reached, and to implement that cleanly.

> May I implement this feature and submit a patch for the next commitfest
> if I have time?

Please feel free. I might as well participate in the review.

Regards,
--
Michael
Re: [HACKERS] Performance optimization of btree binary search
On Thu, Dec 5, 2013 at 2:19 AM, Peter Geoghegan wrote:
> I did a relatively short variant 'insert' pgbench-tools benchmark, with a
> serial primary index created on the pgbench_history table. Results:
>
> http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/insert/

I saw something today that made me lose confidence in the results I've presented. I was unsuccessful in re-creating similar 'select' numbers at scale 15 on the same server [1] (originally I wanted to see how eliding the fmgr trampoline worked out with binary searching only). Then, when re-testing master as of commit 8e18d04d4daf34b8a557e2dc553a7754b255cd9a (my feature branches branched off from that master commit), I noticed that even after the last pgbench had run, a single postgres process was stuck at 100% CPU usage for upwards of a minute (the run referred to was for 32 clients, and I only saw that one process in 'top' output). To me this points to either 1) some kind of contamination or 2) a bug in Postgres. My first guess is that it's the issue described here [2] and elsewhere (maybe whether or not that behavior constitutes a bug is controversial, but I think it does).

Consider the contrast between these two runs (against master, 2 clients) of the same, new benchmark:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/new-select/50/index.html
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/new-select/44/index.html

I had considered that something like Intel SpeedStep technology had a role here, but I'm pretty sure it steps up very aggressively when things are CPU bound - I tested that against a Core 2 Duo desktop a couple of years back, where it was easy to immediately provoke it by moving around desktop windows or something. I know that Greg Smith controls for that by disabling it, but I have not done so here - I assumed that he only did so because he was being particularly cautious.
There are similar runs on my original 'results' benchmark (these particular should-be-comparable-but-are-not runs are with 1 client against my patch this time):

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/results/297/index.html
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/results/303/index.html

References:

[1] http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/new-select/
[2] http://www.postgresql.org/message-id/CAFj8pRDDa40eiP4UTrCm=+bdt0xbwf7qc8t_3y0dfqyuzk2...@mail.gmail.com

--
Peter Geoghegan
[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
MauMau wrote
> From: "Tom Lane" <tgl@.pa>
>
>> There is no enthusiasm for a quick-hack solution here, and most people
>> don't actually agree with your proposal that these errors should never
>> get logged. So no, that is not happening. You can hack your local copy
>> that way if you like of course, but it's not getting committed.
>
> Oh, I may have misunderstood your previous comments. I got the impression
> that you and others regard those messages (except "too many clients") as
> unnecessary in server log.
>
> 1. FATAL: the database system is starting up

How about altering the message to tone down the severity by a half-step...

FATAL: (probably) not! - the database system is starting up

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/RFC-Shouldn-t-we-remove-annoying-FATAL-messages-from-server-log-tp5781899p5782236.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
David Johnston wrote
> MauMau wrote
>> From: "Tom Lane" <tgl@.pa>
>>
>>> There is no enthusiasm for a quick-hack solution here, and most people
>>> don't actually agree with your proposal that these errors should never
>>> get logged. So no, that is not happening. You can hack your local copy
>>> that way if you like of course, but it's not getting committed.
>>
>> Oh, I may have misunderstood your previous comments. I got the
>> impression that you and others regard those messages (except "too many
>> clients") as unnecessary in server log.
>>
>> 1. FATAL: the database system is starting up
>> 2. FATAL: the database system is shutting down
>> 3. FATAL: the database system is in recovery mode
>>
>> 5. FATAL: terminating walreceiver process due to administrator command
>> 6. FATAL: terminating background worker \"%s\" due to administrator
>> command
>>
>> Could you tell me why these are necessary in server log? I guess like
>> this. Am I correct?
>>
>> * #1 through #3 are necessary for the DBA to investigate and explain to
>> the end user why he cannot connect to the database.
>>
>> * #4 and #5 are unnecessary for the DBA. I can't find out any reason why
>> these are useful for the DBA.
>
> For me 1-3 are unusual events in production situations and so knowing
> when they occur, and confirming they occurred for a good reason, is a key
> job of the DBA.
>
> 5 and 6: I don't fully understand when they would happen but likely fall
> into the same "the DBA should know what is going on with their server and
> confirm any startup/shutdown activity it is involved with".
>
> They might be better categorized "NOTICE" level if they were in response
> to a administrator action, versus in response to a crashed process, but
> even for the user-initiated situation making sure they hit the log but
> using FATAL is totally understandable and IMO desirable.
> I'd ask in what situations are these messages occurring so frequently
> that they are becoming noise instead of useful data? Sorry if I missed
> your use-case explanation up-thread.
>
> David J.

Went and scanned the thread: PITR/failover is not generally that frequent an occurrence, but I will agree that these events are likely common during such. Maybe PITR/failover mode can output something in the logs to alleviate user angst about these frequent events? I'm doubting that a totally separate mechanism can be used for this "mode", but instead of looking for things to remove, how about adding some additional coddling to the logs at the beginning and end of the mode change? Thought-provoking only, as I have not actually been a user of said feature.

David J.
[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
MauMau wrote
> From: "Tom Lane" <tgl@.pa>
>
>> There is no enthusiasm for a quick-hack solution here, and most people
>> don't actually agree with your proposal that these errors should never
>> get logged. So no, that is not happening. You can hack your local copy
>> that way if you like of course, but it's not getting committed.
>
> Oh, I may have misunderstood your previous comments. I got the impression
> that you and others regard those messages (except "too many clients") as
> unnecessary in server log.
>
> 1. FATAL: the database system is starting up
> 2. FATAL: the database system is shutting down
> 3. FATAL: the database system is in recovery mode
>
> 5. FATAL: terminating walreceiver process due to administrator command
> 6. FATAL: terminating background worker \"%s\" due to administrator
> command
>
> Could you tell me why these are necessary in server log? I guess like
> this. Am I correct?
>
> * #1 through #3 are necessary for the DBA to investigate and explain to
> the end user why he cannot connect to the database.
>
> * #4 and #5 are unnecessary for the DBA. I can't find out any reason why
> these are useful for the DBA.

For me 1-3 are unusual events in production situations, and so knowing when they occur, and confirming they occurred for a good reason, is a key job of the DBA.

5 and 6: I don't fully understand when they would happen, but they likely fall into the same "the DBA should know what is going on with their server and confirm any startup/shutdown activity it is involved with".

They might be better categorized at "NOTICE" level if they were in response to an administrator action, versus in response to a crashed process, but even for the user-initiated situation, making sure they hit the log using FATAL is totally understandable and IMO desirable.

I'd ask in what situations these messages are occurring so frequently that they are becoming noise instead of useful data? Sorry if I missed your use-case explanation up-thread.

David J.
[HACKERS] Recovery to backup point
Hello,

It seems that everyone welcomed the following functionality, and I also want it to solve some problems. But this doesn't appear to have been undertaken.

Recovery target 'immediate'
http://www.postgresql.org/message-id/51703751.2020...@vmware.com

Is there any technical difficulty? May I implement this feature and submit a patch for the next commitfest if I have time?

Regards,
MauMau
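[For concreteness, a sketch of how the proposed target might look in recovery.conf, following the naming suggested in the linked thread. The parameter name and value are illustrative only; the feature was not committed at the time of this message.]

```
# recovery.conf sketch -- stop recovery as soon as a consistent state is
# reached (i.e. at the end of the base backup), instead of replaying all
# available WAL:
restore_command = 'cp /mnt/archive/%f %p'
recovery_target = 'immediate'
```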
Re: [HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Tom Lane"
> There is no enthusiasm for a quick-hack solution here, and most people
> don't actually agree with your proposal that these errors should never
> get logged. So no, that is not happening. You can hack your local copy
> that way if you like of course, but it's not getting committed.

Oh, I may have misunderstood your previous comments. I got the impression that you and others regard those messages (except "too many clients") as unnecessary in the server log.

1. FATAL: the database system is starting up
2. FATAL: the database system is shutting down
3. FATAL: the database system is in recovery mode

5. FATAL: terminating walreceiver process due to administrator command
6. FATAL: terminating background worker \"%s\" due to administrator command

Could you tell me why these are necessary in the server log? I guess like this. Am I correct?

* #1 through #3 are necessary for the DBA to investigate and explain to the end user why he cannot connect to the database.

* #5 and #6 are unnecessary for the DBA. I can't find any reason why these are useful for the DBA.

Regards,
MauMau
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes:

Tom> Not wanting to consider the sort args when there's more than one
Tom> doesn't square with forcing them to be considered when there's just
Tom> one. It's the same aggregate after all,

This logic is only applied in the patch to aggregates that _aren't_ hypothetical.

(thinking out loud:) It might be more consistent to also add the condition that the single sort column not be a variadic arg. And/or the condition that it be the same type as the result. Or have a flag in pg_aggregate to say "this agg returns one of its sorted input values, so preserve the collation".

>> Consider a construct like:
>> select max(common_val)
>> from (select mode() within group (order by textcol) as common_val
>> from ... group by othercol) s;

Tom> AFAICT none of the SQL-spec aggregates expose the kind of case I'm
Tom> worried about, because none of the ones that can take multiple sort
Tom> columns have a potentially collatable return type.

None of the spec's ordered-set functions expose any collation issue at all, because they _all_ have non-collatable return types, period. The problem only arises from the desire to make functions like percentile_disc and mode applicable to collatable types.

--
Andrew (irc:RhodiumToad)
Re: [HACKERS] writable FDWs / update targets confusion
Tomas Vondra writes:
> I think that we should make the documentation more explicit about this
> limitation, because the current wording in fdw-callbacks documentation
> seems to suggest it's possible to add such hidden columns. At least
> that's how I read it before running into this.

You can add hidden columns if you've got 'em ;-). What's missing is the ability to create any hidden columns other than the ones in standard PG tables.

What we most likely need is the ability for an FDW to override the type assigned to the CTID column at foreign table creation. (We'd then also need to think about where such a column could be shoehorned into a tuple, but the catalog support has to come first.) Alternatively, it might work to append "junk" columns in the user column numbering domain, which would only exist in runtime tuple descriptors and not in the catalogs.

regards, tom lane
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth writes:
> "Tom" == Tom Lane writes:
>
> Tom> Meh. I don't think you can have that and also have the behavior
> Tom> that multiple ORDER BY items aren't constrained to have the same
> Tom> collation; at least not without some rule that amounts to a
> Tom> special case for percentile_disc, which I'd resist.
>
> What the submitted patch does (as discussed in the comment in
> parse_collate) is to treat the sort argument as contributing to the
> collation only if there is exactly one sort arg.

Blech :-( Not wanting to consider the sort args when there's more than one doesn't square with forcing them to be considered when there's just one. It's the same aggregate after all, and in general the semantics ought to be the same whether you give one sort col or three.

We might be able to make this work sanely by considering only argument positions that were declared something other than ANY (anyelement being other than that, so this isn't leaving polymorphics in the cold entirely). This is a bit unlike the normal rules for collation assignment, but it doesn't result in weird semantics changes depending on how many sort columns you supply.

> Consider a construct like:
> select max(common_val)
> from (select mode() within group (order by textcol) as common_val
> from ... group by othercol) s;

AFAICT none of the SQL-spec aggregates expose the kind of case I'm worried about, because none of the ones that can take multiple sort columns have a potentially collatable return type. I don't think that justifies not thinking ahead, though.

regards, tom lane
Re: [HACKERS] writable FDWs / update targets confusion
On 18.11.2013 09:28, Albe Laurenz wrote:
> Tom Lane wrote:
>>> Tom, could you show us a rope if there is one?
>>
>> What is it you actually need to fetch?
>>
>> IIRC, the idea was that most FDWs would do the equivalent of fetching
>> the primary-key columns to use in an update. If that's what you need,
>> then AddForeignUpdateTargets should identify those columns and generate
>> Vars for them. postgres_fdw is probably not a good model since it's
>> using ctid (a non-portable thing) and piggybacking on the existence of a
>> tuple header field to put that in.
>>
>> If you're dealing with some sort of hidden tuple identity column that
>> works like CTID but doesn't fit in six bytes, there may not be any good
>> solution in the current state of the FDW support. As I mentioned, we'd
>> batted around the idea of letting FDWs define a system column with some
>> other datatype besides TID, but we'd not figured out all the nitty
>> gritty details in time for 9.3.
>
> I was hoping for the latter (a hidden column).
>
> But I'll have to settle for primary keys, which is also ok.

I was hoping for the latter too, and I can't really switch to primary key columns. I can live with 6 bytes passed in the ctid field for now, but it'd be nice to be able to use at least 8 bytes.

I think that we should make the documentation more explicit about this limitation, because the current wording in the fdw-callbacks documentation seems to suggest it's possible to add such hidden columns. At least that's how I read it before running into this.

regards
Tomas
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes:

Tom> 2. For an ordered set function, n must equal aggnfixedargs. We treat
Tom> all n fixed arguments as contributing to the aggregate's result
Tom> collation, but ignore the sort arguments.

>> That doesn't work for getting a sensible collation out of
>> percentile_disc applied to a collatable type. (Which admittedly is an
>> extension to the spec, which allows only numeric and interval, but it
>> seems to me to be worth having.)

Tom> Meh. I don't think you can have that and also have the behavior that
Tom> multiple ORDER BY items aren't constrained to have the same
Tom> collation; at least not without some rule that amounts to a special
Tom> case for percentile_disc, which I'd resist.

What the submitted patch does (as discussed in the comment in parse_collate) is to treat the sort argument as contributing to the collation only if there is exactly one sort arg.

Consider a construct like:

select max(common_val)
from (select mode() within group (order by textcol) as common_val
      from ... group by othercol) s;

(the same arguments for percentile_disc also apply to mode() and to any other ordered set function that returns a value chosen from its input sorted set)

Having this sort of thing not preserve the collation of textcol (or fail) would be, IMO, surprising and undesirable.

--
Andrew (irc:RhodiumToad)
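[To make the collation question concrete, here are the kinds of calls being argued over; table and column names are hypothetical. These aggregates return one of their sorted input values, so the question is whether the sort column's collation, explicit or implicit, carries over to the result:]

```sql
-- percentile_disc applied to a collatable type (an extension to the
-- spec, which only allows numeric and interval):
SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY name COLLATE "C")
FROM people;

-- mode() likewise hands back one of the sorted textcol values; any
-- comparison later applied to that result needs a collation to be
-- well-defined:
SELECT mode() WITHIN GROUP (ORDER BY textcol)
FROM t
GROUP BY othercol;
```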
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth writes:
> The patch as submitted answers those questions as follows:
> CREATE AGGREGATE func(integer) WITHIN GROUP (text) ...

You've glossed over a significant amount of complexity, as shown by your example that prints WITHIN GROUP (*), a syntax that you've not defined here.

In the long run we might think it worthwhile to actually store two separate arglists for ordered-set aggregates; probably, pg_proc.proargs would just describe the direct arguments and there'd be a second oidvector in pg_aggregate that would describe the ORDER BY arguments. This'd let them be independently VARIADIC, or not. I'm not proposing we do that right now, because we don't have any use-cases that aren't sufficiently handled by the hack of letting a single VARIADIC ANY entry cover both sets of arguments. I'd like though that the external syntax not be something that prevents that from ever happening, and I'm afraid that this (*) business is cheating enough to be a problem.

regards, tom lane
Re: [HACKERS] WITHIN GROUP patch
> "Josh" == Josh Berkus writes:

>> Since I don't particularly trust my own judgement on aesthetics, I used
>> the feedback I got from others when deciding what to do. Frankly I think
>> this one needs wider input than just you and me arguing over it.

Josh> Can someone paste examples of the two syntax alternatives we're
Josh> talking about here? I've lost track.

OK. The starting point is that this is the calling syntax for ordered set funcs, set by the spec:

SELECT func(foo) WITHIN GROUP (ORDER BY bar) FROM ...

where "foo" and "bar" might be fixed types, or polymorphic, or variadic.

So we need to define (with no help from the spec) at least these:

- what syntax is used in CREATE AGGREGATE to specify the number and types of parameters for a newly defined "func"
- what syntax is used to refer to the function in a GRANT ... ON FUNCTION ... statement, or ALTER ... OWNER TO ...
- what ::regprocedure casts should accept as input and produce as output
- what output is shown in \df and \da when listing the function's argument types

The patch as submitted answers those questions as follows:

CREATE AGGREGATE func(integer) WITHIN GROUP (text) ...

GRANT ... ON FUNCTION func(integer,text) ...
(there is no GRANT ... ON AGGREGATE ... (yet))

ALTER AGGREGATE func(integer) WITHIN GROUP (text) OWNER TO ...
ALTER AGGREGATE func(integer,text) OWNER TO ...
ALTER FUNCTION func(integer,text) OWNER TO ...
(all three of those are equivalent)

regprocedure outputs 'func(integer,text)' and accepts only that as input

postgres=# \df func
                            List of functions
 Schema | Name | Result data type |      Argument data types      | Type
--------+------+------------------+-------------------------------+------
 public | func | text             | (integer) WITHIN GROUP (text) | agg
(1 row)

If there's also a func() that isn't an ordered set function, you get output like this (which provoked Tom's "schizophrenic" comment):

postgres=# \df ran[a-z]{1,5}
                                    List of functions
   Schema   |   Name   | Result data type |        Argument data types        |  Type
------------+----------+------------------+-----------------------------------+--------
 pg_catalog | random   | double precision |                                   | normal
 pg_catalog | rangesel | double precision | internal, oid, internal, integer  | normal
 pg_catalog | rank     | bigint           |                                   | window
 pg_catalog | rank     | bigint           | (VARIADIC "any") WITHIN GROUP (*) | agg
(4 rows)

--
Andrew (irc:RhodiumToad)
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 12/6/13, 12:29 PM, Álvaro Hernández Tortosa wrote:
> What I've been trying to do is summarize what has already been discussed
> here and propose a solution. You say that "you can already do those
> things", but that's not what I have read here. Greg Smith (cc'ed as I'm
> quoting you) was explaining this in [1]:
>
> "Right now, writing such a tool in a generic way gets so bogged down just
> in parsing/manipulating the postgresql.conf file that it's hard to focus
> on actually doing the tuning part."

That was in 2008. I don't think that stance is accurate anymore.
Re: [HACKERS] WITHIN GROUP patch
Josh Berkus writes: > Can someone paste examples of the two syntax alternatives we're talking > about here? I've lost track. I'll leave it to Andrew to describe/defend exactly what his patch is doing. The alternative I'm thinking about is that in CREATE AGGREGATE as well as \da output, the argument list of an ordered-set aggregate would look like type1, type2, ... ORDER BY type3, type4, ... if the aggregate only permits a fixed number of ordering columns (as mode() does for example). If it permits a variable number of ordering columns, you could have type1, type2, ... ORDER BY [ type3, type4, ... ] VARIADIC something 99% of the time the right-hand part would just be "VARIADIC ANY" but if an aggregate had need to lock down the ordering column types, or the leading ordering column types, it could do that. If it needs a variable number of direct arguments as well (which in particular hypothetical set functions do) then you would write [ type1, type2, ... ] VARIADIC something ORDER BY VARIADIC something where we constrain the two "somethings" to be the same. (Again, these would *usually* be ANY, but I can envision that an aggregate might want to constrain the argument types more than that.) We have to constrain this case because the underlying pg_proc representation and parser function lookup code only allow the last argument to be declared VARIADIC. So under the hood, this last case would be represented in pg_proc with proargs looking like just "[ type1, type2, ... ] VARIADIC something", whereas in the first two cases the proargs representation would contain all the same entries as above. We could substitute something else for ORDER BY without doing a lot of violence to this, if people are really down on that aspect. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Wildcard usage enhancements in .pgpass
On 11/17/13, 1:56 PM, Martijn van Oosterhout wrote:
> Looks interesting, though I wonder if you could use fnmatch(3) here. Or
> would that match more than you expect? For example, it would allow
> 'foo*bar' to match 'foo.bar' which your code doesn't.

The question is whether you'd want that. We had previously considered using fnmatch() for wildcard matching of host names in certificates and ended up deciding against that. It would be worth checking that old thread.

It would also be good if the pgpass wildcard matching were somehow consistent with certificates, since they are both somewhat related security features.
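The behavioral difference Martijn raises can be seen directly. Python's fnmatch module wraps the same glob rules as fnmatch(3), so a small sketch illustrates how '*' crosses a '.' there; the stricter matcher below is only an illustration of the quoted concern, not the patch's actual code:

```python
import fnmatch
import re

# fnmatch(3)-style matching: '*' matches any run of characters,
# including '.', so 'foo*bar' matches 'foo.bar'.
print(fnmatch.fnmatch("foo.bar", "foo*bar"))  # True

def match_no_dot(pattern: str, s: str) -> bool:
    # A stricter glob where '*' matches any run of NON-dot characters,
    # approximating the behavior attributed to the patch's hand-rolled code.
    regex = "".join("[^.]*" if c == "*" else re.escape(c) for c in pattern)
    return re.fullmatch(regex, s) is not None

print(match_no_dot("foo*bar", "foo.bar"))  # False: '*' may not cross '.'
print(match_no_dot("foo*bar", "fooxbar"))  # True
```

This is exactly the property that made fnmatch() look too permissive for host-name matching in certificates.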
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes: Tom> Another thing to think about here is to wonder why the committee chose Tom> anything as verbose as "agg(...) WITHIN GROUP (ORDER BY ...)" in the Tom> first place. The words ORDER BY certainly seem pretty unnecessary. All of the ordered-set functions that the spec defines are intimately tied to ordering of values, and they allow things like DESC ordering to affect the results, so it seems obvious to me that they used (ORDER BY ...) because what follows is both syntactically and semantically similar to any other use of ORDER BY. (Similar logic seems to have been used for "FILTER (WHERE ...)".) (The name "ordered set function" also seems quite specific.) So I don't think there's any greater chance of the spec people adding a WITHIN GROUP (something_other_than_order_by) construct than of them adding any other awkward piece of syntax - and if they did, it would represent an entirely new class of aggregate functions, separate from ordered set functions and no doubt with its own headaches. (I realize that expecting sanity from the spec writers is perhaps unwise) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
On 12/06/2013 01:30 PM, Andrew Gierth wrote: > Since I don't particularly trust my own judgement on aesthetics, I used > the feedback I got from others when deciding what to do. Frankly I think > this one needs wider input than just you and me arguing over it. Can someone paste examples of the two syntax alternatives we're talking about here? I've lost track. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
On Fri, Dec 6, 2013 at 12:24 PM, Tom Lane wrote: >> There seems to be no problem even if we use bigint as the type of >> unsigned 32-bit integer like queryid. For example, txid_current() >> returns the transaction ID, i.e., unsigned 32-bit integer, as bigint. >> Could you tell me what the problem is when using bigint for queryid? > > We're talking about the output of some view, right, not internal storage? > +1 for using bigint for that. Using OID is definitely an abuse, because > the value *isn't* an OID. And besides, what if we someday decide we need > 64-bit keys not 32-bit? Fair enough. I was concerned about the cost of external storage of 64-bit integers (unlike query text, they might have to be stored many times for many distinct intervals or something like that), but in hindsight that was fairly miserly of me. Attached revision displays signed 64-bit integers instead. -- Peter Geoghegan diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile new file mode 100644 index e8aed61..95a2767 *** a/contrib/pg_stat_statements/Makefile --- b/contrib/pg_stat_statements/Makefile *** MODULE_big = pg_stat_statements *** 4,11 OBJS = pg_stat_statements.o EXTENSION = pg_stat_statements ! DATA = pg_stat_statements--1.1.sql pg_stat_statements--1.0--1.1.sql \ ! pg_stat_statements--unpackaged--1.0.sql ifdef USE_PGXS PG_CONFIG = pg_config --- 4,11 OBJS = pg_stat_statements.o EXTENSION = pg_stat_statements ! DATA = pg_stat_statements--1.2.sql pg_stat_statements--1.1--1.2.sql \ ! 
pg_stat_statements--1.0--1.1.sql pg_stat_statements--unpackaged--1.0.sql ifdef USE_PGXS PG_CONFIG = pg_config diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql new file mode 100644 index ...7824e3e *** a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql --- b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql *** *** 0 --- 1,43 + /* contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql */ + + -- complain if script is sourced in psql, rather than via ALTER EXTENSION + \echo Use "ALTER EXTENSION pg_stat_statements UPDATE" to load this file. \quit + + /* First we have to remove them from the extension */ + ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements; + ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements(); + + /* Then we can drop them */ + DROP VIEW pg_stat_statements; + DROP FUNCTION pg_stat_statements(); + + /* Now redefine */ + CREATE FUNCTION pg_stat_statements( + OUT userid oid, + OUT dbid oid, + OUT queryid bigint, + OUT query text, + OUT calls int8, + OUT total_time float8, + OUT rows int8, + OUT shared_blks_hit int8, + OUT shared_blks_read int8, + OUT shared_blks_dirtied int8, + OUT shared_blks_written int8, + OUT local_blks_hit int8, + OUT local_blks_read int8, + OUT local_blks_dirtied int8, + OUT local_blks_written int8, + OUT temp_blks_read int8, + OUT temp_blks_written int8, + OUT blk_read_time float8, + OUT blk_write_time float8 + ) + RETURNS SETOF record + AS 'MODULE_PATHNAME' + LANGUAGE C; + + CREATE VIEW pg_stat_statements AS + SELECT * FROM pg_stat_statements(); + + GRANT SELECT ON pg_stat_statements TO PUBLIC; diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.1.sql b/contrib/pg_stat_statements/pg_stat_statements--1.1.sql new file mode . 
index 42e4d68..e69de29 *** a/contrib/pg_stat_statements/pg_stat_statements--1.1.sql --- b/contrib/pg_stat_statements/pg_stat_statements--1.1.sql *** *** 1,43 - /* contrib/pg_stat_statements/pg_stat_statements--1.1.sql */ - - -- complain if script is sourced in psql, rather than via CREATE EXTENSION - \echo Use "CREATE EXTENSION pg_stat_statements" to load this file. \quit - - -- Register functions. - CREATE FUNCTION pg_stat_statements_reset() - RETURNS void - AS 'MODULE_PATHNAME' - LANGUAGE C; - - CREATE FUNCTION pg_stat_statements( - OUT userid oid, - OUT dbid oid, - OUT query text, - OUT calls int8, - OUT total_time float8, - OUT rows int8, - OUT shared_blks_hit int8, - OUT shared_blks_read int8, - OUT shared_blks_dirtied int8, - OUT shared_blks_written int8, - OUT local_blks_hit int8, - OUT local_blks_read int8, - OUT local_blks_dirtied int8, - OUT local_blks_written int8, - OUT temp_blks_read int8, - OUT temp_blks_written int8, - OUT blk_read_time float8, - OUT blk_write_time float8 - ) - RETURNS SETOF record - AS 'MODULE_PATHNAME' - LANGUAGE C; - - -- Register a view on the function for ease of use. - CREATE VIEW pg_stat_statements AS - SELECT * FROM pg_stat_statements(); - - GRANT SELECT ON pg_stat_statements TO PUBLIC; - - -- Don't want this to be available to non-superusers. - REVOKE ALL ON FUNCTION pg_stat_statements_reset() FROM PUBLIC; --- 0
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes:

 >> pg_get_function_arguments()'s interface assumes that the caller is
 >> providing the enclosing parens. Changing it would have meant
 >> returning a result like 'float8) WITHIN GROUP (float8' which I
 >> reckoned would have too much chance of breaking existing callers.

 Tom> Well, as it stands, existing callers are broken a fortiori;

Not in most cases, because using the output of pg_get_function_arguments works in all contexts except for constructing the CREATE AGGREGATE statement (for example, a function that uses the pg_get_function_arguments output to construct GRANTs or ALTER OWNERs would still work). And since constructing a correct CREATE AGGREGATE statement for an ordered set function requires that the caller know about the additional options to supply in the body, this does not seem like a restriction.

 Tom> Of course, if we define the right output as being just the
 Tom> arguments according to pg_proc, it's fine.

The patch accepts the 'just the arguments according to pg_proc' form as input everywhere except in CREATE AGGREGATE, in addition to the syntax with explicit WITHIN GROUP. (With the exception of GRANT, as it happens, where since there is no existing GRANT ON AGGREGATE variant and the patch doesn't add one, only the pg_proc arguments form is accepted.)

 Tom> But your point about the parens is a good one anyway, because
 Tom> now that I think about it, what \da has traditionally printed is

 Tom>  pg_catalog | sum | bigint | integer   | sum as bigint across all integer input values

 Tom> and I see the patch makes it do this

 Tom>  pg_catalog | sum | bigint | (integer) | sum as bigint across all integer input values

 Tom> which I cannot find to be an improvement, especially since \df
 Tom> does not look like that. (As patched, the output of "\df ran*"
 Tom> is positively schizophrenic.)

Since I don't particularly trust my own judgement on aesthetics, I used the feedback I got from others when deciding what to do.
Frankly I think this one needs wider input than just you and me arguing over it. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> Actually, now that I think of it, why not use this syntax for > Tom> declaration and display purposes: > Tom> type1, type2 ORDER BY type3, type4 > But unfortunately it looks exactly like the calling sequence for a > normal aggregate with an order by clause - I really think that is > potentially too much confusion. I thought about that too, but really that ship sailed long ago, and it went to sea under the SQL committee's captaincy, so it's not our fault. There are already at least four different standards-blessed ways you can use ORDER BY in a query, some of them quite nearby (eg window functions); so the potential for confusion is there no matter what we do. In this case, if we describe ordered-set aggregates using WITHIN GROUP rather than ORDER BY, we might avoid confusion with the normal-aggregate case, but instead we will have confusion about what the arguments even do. Is semantic confusion better than syntactic confusion? Another thing to think about here is to wonder why the committee chose anything as verbose as "agg(...) WITHIN GROUP (ORDER BY ...)" in the first place. The words ORDER BY certainly seem pretty unnecessary. I'm suspicious that they might've been leaving the door open to put other things into the second set of parens later --- GROUP BY, maybe? So down the road, we might regret it if we key off WITHIN GROUP and not ORDER BY. Having said that, I'm not so dead set on it that I won't take WITHIN GROUP if that's what more people want. But we gotta lose the extra parens; they are just too strange for function declaration/documentation purposes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth wrote:
>> "Tom" == Tom Lane writes:
>
> >> Please don't object that that doesn't look exactly like the syntax
> >> for calling the function, because it doesn't anyway --- remember
> >> you also need ORDER BY in the call.
>
> Tom> Actually, now that I think of it, why not use this syntax for
> Tom> declaration and display purposes:
>
> Tom> type1, type2 ORDER BY type3, type4
>
> Tom> This has nearly as much relationship to the actual calling
> Tom> syntax as the WITHIN GROUP proposal does,
>
> But unfortunately it looks exactly like the calling sequence for a
> normal aggregate with an order by clause - I really think that is
> potentially too much confusion.
>
> -- 
> Andrew (irc:RhodiumToad)

How about:

type1, type2 GROUP ORDER type3, type4

or:

GROUP type1, type2 ORDER type3, type4

David J.
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes: >> Please don't object that that doesn't look exactly like the syntax >> for calling the function, because it doesn't anyway --- remember >> you also need ORDER BY in the call. Tom> Actually, now that I think of it, why not use this syntax for Tom> declaration and display purposes: Tom> type1, type2 ORDER BY type3, type4 Tom> This has nearly as much relationship to the actual calling Tom> syntax as the WITHIN GROUP proposal does, But unfortunately it looks exactly like the calling sequence for a normal aggregate with an order by clause - I really think that is potentially too much confusion. (It's one thing not to look like the calling syntax, it's another to look exactly like a valid calling sequence for doing something _different_.) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reference to parent query from ANY sublink
Kevin Grittner wrote:
> test=# SELECT *
> FROM tab1 a
> LEFT JOIN
> tab2 b
> ON a.i = ANY (
> SELECT k
> FROM tab3 c
> WHERE k = a.i);
>  i | j
> ---+---
>  1 | 4
>  1 | 5
>  1 | 6
>  2 |
>  3 | 4
>  3 | 5
>  3 | 6
> (7 rows)
>
>> SELECT *
>> FROM tab1 a
>> LEFT JOIN
>> (
>> SELECT *
>> tab2 b
>> SEMI JOIN
>> ( SELECT k
>>   FROM tab3 c
>>   WHERE k = a.i
>> ) AS ANY_subquery
>> ON a.i = ANY_subquery.k
>> ) AS SJ_subquery
>> ON true;
>
> It is hard to see what you intend here

Perhaps you were looking for a way to formulate it something like this?:

test=# SELECT *
test-# FROM tab1 a
test-# LEFT JOIN LATERAL
test-# (
test(# SELECT *
test(# FROM tab2 b
test(# WHERE EXISTS
test(# (
test(# SELECT *
test(# FROM tab3 c
test(# WHERE c.k = a.i
test(# )
test(# ) AS SJ_subquery
test-# ON true;
 i | j
---+---
 1 | 4
 1 | 5
 1 | 6
 2 |
 3 | 4
 3 | 5
 3 | 6
(7 rows)

Without LATERAL you get an error:

ERROR:  invalid reference to FROM-clause entry for table "a"
LINE 11: WHERE c.k = a.i
                     ^

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
Fujii Masao writes: > On Sun, Nov 24, 2013 at 10:58 AM, Peter Geoghegan wrote: >> I decided that queryid should be of type oid, not bigint. This is >> arguably a slight abuse of notation, but since ultimately Oids are >> just abstract object identifiers (so say the docs), but also because >> there is no other convenient, minimal way of representing unsigned >> 32-bit integers in the view that I'm aware of, I'm inclined to think >> that it's appropriate. > There seems to be no problem even if we use bigint as the type of > unsigned 32-bit integer like queryid. For example, txid_current() > returns the transaction ID, i.e., unsigned 32-bit integer, as bigint. > Could you tell me what the problem is when using bigint for queryid? We're talking about the output of some view, right, not internal storage? +1 for using bigint for that. Using OID is definitely an abuse, because the value *isn't* an OID. And besides, what if we someday decide we need 64-bit keys not 32-bit? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
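The widening Tom endorses — exposing an unsigned 32-bit value through a signed 64-bit (bigint) column, as txid_current() does for transaction IDs — can be sketched numerically (illustrative Python, not the patch's C code):

```python
def as_signed32(u: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer,
    which is what would happen if the view column were plain int4."""
    return u - 2**32 if u >= 2**31 else u

def as_int64(u: int) -> int:
    """Widen an unsigned 32-bit value into a signed 64-bit column:
    every uint32 fits non-negatively, so nothing wraps negative."""
    return u & 0xFFFFFFFF

queryid = 0xDEADBEEF  # a hash value in the upper half of the uint32 range

print(as_signed32(queryid))  # -559038737: what a signed 32-bit column would show
print(as_int64(queryid))     # 3735928559: what a bigint column shows
```

The bigint route also leaves headroom if the keys are ever widened to 64 bits, which an oid column would not.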
Re: [HACKERS] WITHIN GROUP patch
I wrote: > One possibility is to forget all the parens and say that the display > looks like > type1, type2 WITHIN GROUP type3, type4 > Please don't object that that doesn't look exactly like the syntax > for calling the function, because it doesn't anyway --- remember you > also need ORDER BY in the call. Actually, now that I think of it, why not use this syntax for declaration and display purposes: type1, type2 ORDER BY type3, type4 This has nearly as much relationship to the actual calling syntax as the WITHIN GROUP proposal does, and it's hugely saner from a semantic standpoint, because after all the ordering columns are ordering columns, not grouping columns. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth writes:
> "Tom" == Tom Lane writes:
> Tom> Regardless of that, though ... what is the reasoning for
> Tom> inventing pg_get_aggregate_arguments() rather than just making
> Tom> pg_get_function_arguments() do the right thing?
>
> pg_get_function_arguments()'s interface assumes that the caller is
> providing the enclosing parens. Changing it would have meant returning
> a result like 'float8) WITHIN GROUP (float8' which I reckoned would
> have too much chance of breaking existing callers.

Well, as it stands, existing callers are broken a fortiori; they can't possibly produce the right output for an ordered-set aggregate, if we define the "right output" as looking like that. Of course, if we define the right output as being just the arguments according to pg_proc, it's fine.

But your point about the parens is a good one anyway, because now that I think about it, what \da has traditionally printed is

 pg_catalog | sum | bigint | integer   | sum as bigint across all integer input values

and I see the patch makes it do this

 pg_catalog | sum | bigint | (integer) | sum as bigint across all integer input values

which I cannot find to be an improvement, especially since \df does not look like that. (As patched, the output of "\df ran*" is positively schizophrenic.)

One possibility is to forget all the parens and say that the display looks like

 type1, type2 WITHIN GROUP type3, type4

Please don't object that that doesn't look exactly like the syntax for calling the function, because it doesn't anyway --- remember you also need ORDER BY in the call.

Or perhaps we could abbreviate that --- maybe just WITHIN? Abbreviation would be a good thing, especially if we're going to say 'VARIADIC "any"' twice in common cases. OTOH I'm not sure we can make that work as a declaration syntax without reserving WITHIN. I think WITHIN GROUP would work, though I've not tried to see if bison likes it.
Anyway, the long and the short of this is that the SQL committee's bizarre choice of syntax for calling these functions should not be followed too slavishly when declaring them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes: Tom> Regardless of that, though ... what is the reasoning for Tom> inventing pg_get_aggregate_arguments() rather than just making Tom> pg_get_function_arguments() do the right thing? pg_get_function_arguments()'s interface assumes that the caller is providing the enclosing parens. Changing it would have meant returning a result like 'float8) WITHIN GROUP (float8' which I reckoned would have too much chance of breaking existing callers. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
On Sun, Nov 24, 2013 at 10:58 AM, Peter Geoghegan wrote: > On Mon, Nov 18, 2013 at 1:54 AM, Sameer Thakur wrote: >> Please find v10 of patch attached. This patch addresses following >> review comments > > I've cleaned this up - revision attached - and marked it "ready for > committer". > > I decided that queryid should be of type oid, not bigint. This is > arguably a slight abuse of notation, but since ultimately Oids are > just abstract object identifiers (so say the docs), but also because > there is no other convenient, minimal way of representing unsigned > 32-bit integers in the view that I'm aware of, I'm inclined to think > that it's appropriate. There seems to be no problem even if we use bigint as the type of unsigned 32-bit integer like queryid. For example, txid_current() returns the transaction ID, i.e., unsigned 32-bit integer, as bigint. Could you tell me what the problem is when using bigint for queryid? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
I don't especially like the syntax you invented for listing arguments in CREATE AGGREGATE, especially not the WITHIN GROUP (*) special case. If we stick with that we're going to need to touch a lot more places than you have, notably regprocedure. I also fear that this syntax is not appropriate for identifying aggregates reliably, ie, aggregate argument lists that look different in this syntax could reduce to identical pg_proc.proargs lists, and perhaps vice versa. I think we should just have it list the arguments as they'd appear in pg_proc, and rely on aggregate properties (to wit, aggkind and aggndirectargs) to identify ordered-set and hypothetical aggregates. A slightly different question is what \da ought to print. I can't say I think that (VARIADIC "any") WITHIN GROUP (*) is going to prove very helpful to users, but probably just (VARIADIC "any") isn't going to do either, at least not unless we add an aggregate-kind column to the printout, and maybe not even then. It might work to cheat by duplicating the last item if it's variadic: (..., VARIADIC "any") WITHIN GROUP (VARIADIC "any") while if it's not variadic, we'd have to work out which argument positions correspond to the ordered-set arguments and put them in the right places. Regardless of that, though ... what is the reasoning for inventing pg_get_aggregate_arguments() rather than just making pg_get_function_arguments() do the right thing? The separate function seems to accomplish little except complicating life for clients, eg in psql's describe.c. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: programmable file format for postgresql.conf
Álvaro Hernández Tortosa wrote:
>> Note that you are not required to maintain your configuration data in a
>> postgresql.conf-formatted file. You can keep it anywhere you like, GUI
>> around in it, and convert it back to the required format. Most of the
>
> I think it is not a very good idea to encourage GUI tools or tools to
> auto-configure postgres to use a separate configuration file and then
> convert it to postgresql.conf. That introduces a duplication with evil
> problems if either source of data is modified out-of-the-expected-way.
>
> That's why I'm suggesting a config file that is, at the same time,
> usable by both postgres and other external tools. That also enables
> other features such as editing the config file persistently through a
> SQL session.

For my money I'd rather have a single file and/or directory-structure where raw configuration settings are saved in the current 'key = value' format, with simple comments allowed and ignored by PostgreSQL. And being simple key-value, the risk of "out-of-the-expected-way" changes would be minimal.

If you want to put an example configuration file out there, one that will not be considered the true configuration, with lots of comments and meta-data, then great. I'm hoping that someday there is either a curses-based or even a full-fledged GUI that beginners can use to generate the desired configuration.

If we want to put a separate "configuration meta-data" file out there to basically provide a database from which third-party tools can pull out this information, then great. I would not incorporate that same information into the main PostgreSQL configuration file/directory-structure. The biggest advantage is that the meta-data database can be readily modified without any concern about such changes impacting running systems upon update. Then, tools simply need to import "two" files instead of one, link together the meta-data key with the configuration key, and do whatever they were going to do anyway.
If indeed the target audience is going to be novices, then a static text-based document is not going to be the most desirable interface to present. At worst we should simply include a comment-link at the top of the document to a web-page where an interactive tool for configuration file creation would exist. That tool, at the end of the process, could provide the user with text to copy-paste/save into a specified area on the server so the customizations made would override the installed defaults.

David J.
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 06/12/13 04:47, Peter Eisentraut wrote:
> On Thu, 2013-12-05 at 00:51 +0100, Álvaro Hernández Tortosa wrote:
>> The tradeoff seems quite positive to me. I see no strong reasons why
>> not do it... am I missing something?
>
> I don't buy your argument. You say, if we make this change, those
> things will happen. I don't believe it. You can *already* do those
> things, but no one is doing it.

What I've been trying to do is summarize what has already been discussed here and propose a solution. You say that "you can already do those things", but that's not what I have read here. Greg Smith (cc'ed as I'm quoting you) was explaining this in [1]:

"Right now, writing such a tool in a generic way gets so bogged down just in parsing/manipulating the postgresql.conf file that it's hard to focus on actually doing the tuning part."

And I completely agree. The alternative of having two separate sources of metadata is a very bad solution IMHO, as changes done to the postgresql.conf file directly would completely break the tool used otherwise. And parsing the actual postgresql.conf is simply not enough: first because it's difficult to parse all the comments correctly, and then because it lacks a lot of the information required for GUI tools and auto-tuning tools. I'm sure you have read the GUCS Overhaul wiki page [2], which already points out many ideas related to this one.

> But if we make this change, existing users will be inconvenienced,

And I somewhat agree. Adding some metainformation to the postgresql.conf file may be *a little* bit inconvenient for some users. But those users are probably pgsql-hackers or advanced DBAs. And I'm sure everybody here knows keyboard shortcuts and how to fiddle with larger, yet structured, files. We all know how to grep and sed and awk these files, right? On the other hand, this metainformation would be extremely useful for newbies, not-that-experienced DBAs, and even users who go to other databases because postgres is hard to configure.
Adding it would be extremely valuable for them because:

- they would have much more inline information about each parameter, and
- they could use tools to help them with the configuration

So the question is: which group of users are we trying to please? And even if the answer were pgsql-hackers and not the rest of the world out there, is what I'm suggesting really so inconvenient that it should outweigh the other advantages it may bring?

Thanks for your comments,

    aht

[1] http://www.postgresql.org/message-id/pine.gso.4.64.0806020452220.26...@westnet.com
[2] http://wiki.postgresql.org/wiki/GUCS_Overhaul

-- 
Álvaro Hernández Tortosa
---
NOSYS
Networked Open SYStems
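For what it's worth, the parsing burden Greg Smith describes comes mostly from comments, quoting, include directives, and units; the bare 'key = value' core of the format is trivial to consume. A minimal sketch (Python; an illustration only — it deliberately ignores quoted '#' characters inside values and include directives, which a real tool must handle):

```python
import re

def parse_guc_file(text: str) -> dict:
    """Parse simple 'key = value' lines of a postgresql.conf-style file,
    ignoring blank lines and '#' comments. A sketch, not a full parser."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip trailing comments
        if not line:
            continue
        m = re.match(r"(\w+)\s*=?\s*(.+)", line)  # '=' is optional in the real format too
        if m:
            settings[m.group(1)] = m.group(2).strip().strip("'")
    return settings

sample = """
# memory settings
shared_buffers = 128MB
work_mem = '4MB'          # per sort/hash operation
listen_addresses = 'localhost'
"""
print(parse_guc_file(sample))
```

The point of the thread is precisely everything this sketch leaves out: the metadata (types, ranges, descriptions, restart requirements) that a tuning tool needs lives nowhere in the file itself.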
Re: [HACKERS] Dynamic Shared Memory stuff
On Thu, Dec 5, 2013 at 4:06 PM, Heikki Linnakangas wrote: >> That's a very interesting idea. I've been thinking that we needed to >> preserve the property that new workers could attach to the shared >> memory segment at any time, but that might not be necessary in all >> case. We could introduce a new dsm operation that means "i promise no >> one else needs to attach to this segment". Further attachments would >> be disallowed by dsm.c regardless of the implementation in use, and >> dsm_impl.c would also be given a chance to perform >> implementation-specific operations, like shm_unlink and >> shmctl(IPC_RMID). This new operation, when used, would help to reduce >> the chance of leaks and perhaps catch other programming errors as >> well. >> >> What should we call it? dsm_finalize() is the first thing that comes >> to mind, but I'm not sure I like that. > > dsm_unlink() would mirror the underlying POSIX shm_unlink() call, and would > be familiar to anyone who understands how unlink() on a file works on Unix. OK, let me work on that once this CommitFest is done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
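Heikki's analogy to shm_unlink() captures the key property: removing the name forbids any new attach while existing mappings remain valid. Python's multiprocessing.shared_memory wraps the same POSIX calls, so the semantics the proposed dsm_unlink() would mirror can be sketched as follows (POSIX-only; dsm_unlink itself is only a proposal in this thread):

```python
from multiprocessing import shared_memory

# Create a named segment and write through the mapping.
seg = shared_memory.SharedMemory(create=True, size=16)
name = seg.name
seg.buf[0] = 42

seg.unlink()                     # shm_unlink(): the name is removed...
value_after_unlink = seg.buf[0]  # ...but the existing mapping still works

# A fresh attach by name is now refused -- exactly the "no one else
# needs to attach" promise discussed above.
attach_failed = False
try:
    shared_memory.SharedMemory(name=name)
except FileNotFoundError:
    attach_failed = True

print(value_after_unlink, attach_failed)
seg.close()
```

The leak-prevention benefit follows from the same property: once unlinked, the segment disappears automatically when the last mapping is closed, even if a process crashes.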
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
Andres Freund writes: > On 2013-12-06 11:02:48 -0500, Tom Lane wrote: >> I think the special-purpose command line switches you mention can be >> passed through PGOPTIONS, rather than inventing a new parameter -- do you >> have an objection to that? > I am not sure if they currently will get recognized early enough and > whether permission checking will interferes, but if so, that's probably > fixable. Shouldn't be a problem --- the single-user mode will just concatenate the options parameter onto the command line it builds. > There's the question what we're going to end up doing with the current > single user mode? There's some somewhat ugly code around for it... Nothing, in the short term. In a release or two we can get rid of it, probably, but I'd hesitate to provide no overlap at all of these usage modes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
On 2013-12-06 11:02:48 -0500, Tom Lane wrote: > Andres Freund writes: > My feeling is that we should just treat the executable name and data > directory path as new connection parameters, which'd be ignored in > normal-connection mode, just as some other parameters will be ignored in > single-user mode. Otherwise we'll find ourselves building parameter > setting infrastructure that pretty much duplicates what's there for the > existing connection parameters. Right. > I think the special-purpose command line switches you mention can be > passed through PGOPTIONS, rather than inventing a new parameter -- do you > have an objection to that? I am not sure if they currently will get recognized early enough and whether permission checking will interferes, but if so, that's probably fixable. > > Not sure if we need anything but the pid of the postmaster be returned? > > The new PQconnect routine would certainly hand back a PGconn. I think > we'd need a new query function PQchildPid(PGconn *) or some such to > provide access to the child process PID. I was thinking of a pid_t* argument to the new routine, but it's likely unneccessary as we're probably going to end up storing it in PGconn anyway. There's the question what we're going to end up doing with the current single user mode? There's some somewhat ugly code around for it... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] commit fest 2013-11 week 3 report
Last week's status: Fri Nov 29 Status Summary. Needs Review: 29, Waiting on Author: 33, Ready for Committer: 13, Committed: 24, Returned with Feedback: 5, Rejected: 5. Total: 109.

Current status: Fri Dec 6 10:55 Status Summary. Needs Review: 24, Waiting on Author: 21, Ready for Committer: 14, Committed: 26, Returned with Feedback: 16, Rejected: 8. Total: 109.

Although this looks like slow progress on paper, many larger patches have actually made significant progress in their reviews this past week. A lot of those will probably be in very good shape for the next commit fest. We're also very close to achieving the primary goal of having given every submission a decent review, although it looks like Heikki's B-tree patches might end up being the loser.

Now it's crunch time:

- If you are a committer, grab a patch from the Ready for Committer list and commit it.
- If you are a patch author and you hope to get your patch finished next week, make sure you have a current patch, your commit fest entry is up to date, and your reviewers are aware of your intentions.
- If you are a reviewer, grab one of the remaining patches and give it a look, so no one feels sad.
- If you are a commit fest manager, all patches not in the above categories should be disposed of appropriately by the end of next week.
Re: [HACKERS] pg_archivecleanup bug
Robert Haas writes: > On Thu, Dec 5, 2013 at 6:15 PM, Tom Lane wrote: >> In general, I think there is no excuse for code in the backend to use >> readdir() directly; it should be using ReadDir(), which takes care of this >> as well as error reporting. > My understanding is that the fd.c infrastructure can't be used in the > postmaster. Say what? See ParseConfigDirectory for code that certainly runs in the postmaster, and uses ReadDir(). regards, tom lane
Re: [HACKERS] ANALYZE sampling is too good
It looks like this is a fairly well understood problem because in the real world it's also often cheaper to speak to people in a small geographic area or time interval too. These wikipedia pages sound interesting and have some external references: http://en.wikipedia.org/wiki/Cluster_sampling http://en.wikipedia.org/wiki/Multistage_sampling I suspect the hard part will be characterising the nature of the non-uniformity in the sample generated by taking a whole block. Some of it may come from how the rows were loaded (e.g. older rows were loaded by pg_restore but newer rows were inserted retail) or from the way Postgres works (e.g. hotter rows are on blocks with fewer rows in them and colder rows are more densely packed). I've felt for a long time that Postgres would make an excellent test bed for some aspiring statistics research group. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time-Delayed Standbys
On Fri, Dec 6, 2013 at 1:36 PM, Robert Haas wrote: > On Thu, Dec 5, 2013 at 11:07 PM, Fabrízio de Royes Mello > wrote: > > On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs > wrote: > >> > >> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and > >> > XLOG_XACT_COMMIT_COMPACT checks > >> > >> Why just those? Why not aborts and restore points also? > >> > > > > I think make no sense execute the delay after aborts and/or restore > points, > > because it not change data in a standby server. > > I see no reason to pause for aborts. Aside from the fact that it > wouldn't be reliable in corner cases, as Fabrízio says, there's no > user-visible effect, just as there's no user-visible effect from > replaying a transaction up until just prior to the point where it > commits (which we also do). > > Waiting for restore points seems like it potentially makes sense. If > the standby is delayed by an hour, and you create a restore point and > wait 55 minutes, you might expect that that you can still kill the > standby and recover it to that restore point. > > Makes sense. Fixed. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Timbira: http://www.timbira.com.br >> Blog sobre TI: http://fabriziomello.blogspot.com >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml index 9d80256..12aa917 100644 --- a/doc/src/sgml/recovery-config.sgml +++ b/doc/src/sgml/recovery-config.sgml @@ -142,6 +142,31 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows + + min_standby_apply_delay (integer) + +min_standby_apply_delay recovery parameter + + + +Specifies the amount of time (in milliseconds, if no unit is specified) +which recovery of transaction commits should lag the master. This +parameter allows creation of a time-delayed standby. 
For example, if +you set this parameter to 5min, the standby will +replay each transaction commit only when the system time on the standby +is at least five minutes past the commit time reported by the master. + + +Note that if the master and standby system clocks are not synchronized, +this might lead to unexpected results. + + +This parameter works only for streaming replication deployments. Synchronous +replicas and PITR are not affected. + + + + diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample index 5acfa57..e8617db 100644 --- a/src/backend/access/transam/recovery.conf.sample +++ b/src/backend/access/transam/recovery.conf.sample @@ -123,6 +123,17 @@ # #trigger_file = '' # +# min_standby_apply_delay +# +# By default, a standby server keeps restoring XLOG records from the +# primary as soon as possible. If you want to delay the replay of +# committed transactions from the master, specify a recovery time delay. +# For example, if you set this parameter to 5min, the standby will replay +# each transaction commit only when the system time on the standby is at least +# five minutes past the commit time reported by the master. 
+# +#min_standby_apply_delay = 0 +# #--- # HOT STANDBY PARAMETERS #--- diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index b68230d..7ca2f9b 100755 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -218,6 +218,8 @@ static bool recoveryPauseAtTarget = true; static TransactionId recoveryTargetXid; static TimestampTz recoveryTargetTime; static char *recoveryTargetName; +static int min_standby_apply_delay = 0; +static TimestampTz recoveryDelayUntilTime; /* options taken from recovery.conf for XLOG streaming */ static bool StandbyModeRequested = false; @@ -730,6 +732,7 @@ static void readRecoveryCommandFile(void); static void exitArchiveRecovery(TimeLineID endTLI, XLogSegNo endLogSegNo); static bool recoveryStopsHere(XLogRecord *record, bool *includeThis); static void recoveryPausesHere(void); +static void recoveryDelay(void); static void SetLatestXTime(TimestampTz xtime); static void SetCurrentChunkStartTime(TimestampTz xtime); static void CheckRequiredParameterValues(void); @@ -5474,6 +5477,19 @@ readRecoveryCommandFile(void) (errmsg_internal("trigger_file = '%s'", TriggerFile))); } + else if (strcmp(item->name, "min_standby_apply_delay") == 0) + { + const char *hintmsg; + + if (!parse_int(item->value, &min_standby_apply_delay, GUC_UNIT_MS, + &hintmsg)) +ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("parameter \"%s\" requires a temporal value", "min_standby_
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
Andres Freund writes: > On 2013-12-05 23:01:28 +0200, Heikki Linnakangas wrote: >> Right. Not all of the parameters will make sense for a stand-alone backend >> though, like the hostname and port number. And I think you need need a new >> parameter to pass the path to the 'postgres' executable, unless we re-use >> the host parameter for that. > Hm. I'd guessed that we wouldn't use the connection string to pass down > the executable name and the datadir now that we're inventing a separate > function. But maybe that's unneccessary. > What parameters do we require to be set for that mode: > * path to postgres > * data directory > * database name (single mode after all) > * port, because of the shmem key? I'd say that's not important enough > I think we also need to be able to pass some additional parameters to > postgres: > - config_file, hba_file, ... might be required to start pg in some > environments > - -P, -O , are sometimes required in cases single user mode is > neccessary for data recovery. Right, so by the time we're done, we'd still need a connection string or the moral equivalent. My feeling is that we should just treat the executable name and data directory path as new connection parameters, which'd be ignored in normal-connection mode, just as some other parameters will be ignored in single-user mode. Otherwise we'll find ourselves building parameter setting infrastructure that pretty much duplicates what's there for the existing connection parameters. I think the special-purpose command line switches you mention can be passed through PGOPTIONS, rather than inventing a new parameter -- do you have an objection to that? > Not sure if we need anything but the pid of the postmaster be returned? The new PQconnect routine would certainly hand back a PGconn. I think we'd need a new query function PQchildPid(PGconn *) or some such to provide access to the child process PID. 
regards, tom lane
Re: [HACKERS] pg_archivecleanup bug
On Thu, Dec 5, 2013 at 6:15 PM, Tom Lane wrote: > In general, I think there is no excuse for code in the backend to use > readdir() directly; it should be using ReadDir(), which takes care of this > as well as error reporting. My understanding is that the fd.c infrastructure can't be used in the postmaster. I agree that sucks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time-Delayed Standbys
On Thu, Dec 5, 2013 at 11:07 PM, Fabrízio de Royes Mello wrote:
> On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs wrote:
>>
>> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
>> > XLOG_XACT_COMMIT_COMPACT checks
>>
>> Why just those? Why not aborts and restore points also?
>>
>
> I think it makes no sense to execute the delay after aborts and/or restore
> points, because it does not change data in a standby server.

I see no reason to pause for aborts. Aside from the fact that it wouldn't be reliable in corner cases, as Fabrízio says, there's no user-visible effect, just as there's no user-visible effect from replaying a transaction up until just prior to the point where it commits (which we also do).

Waiting for restore points seems like it potentially makes sense. If the standby is delayed by an hour, and you create a restore point and wait 55 minutes, you might expect that you can still kill the standby and recover it to that restore point.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
"MauMau" writes: > That discussion sounds interesting, and I want to take more time to > consider it. But what do you think of my original suggestion to easily solve > the current issue? I'd like to remove the current annoying problem first > before spending much time on more exciting infrastructure. There is no enthusiasm for a quick-hack solution here, and most people don't actually agree with your proposal that these errors should never get logged. So no, that is not happening. You can hack your local copy that way if you like of course, but it's not getting committed. regards, tom lane
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
On 2013-12-05 23:01:28 +0200, Heikki Linnakangas wrote: > On 12/05/2013 10:37 PM, Robert Haas wrote: > >On Thu, Dec 5, 2013 at 3:05 PM, Tom Lane wrote: > >>It might be unpleasant to use in some cases, though. > > > >Why would there be more than a few cases in the first place? Who is > >going to use this beyond psql, pg_dump(all), and pg_upgrade, and why? > > Well, you might want to use pgAdmin, or your other favorite admin tool. I'm > not sure how well it would work, and I think it's OK if we say "sorry, can't > do that", but it's not a crazy thing to want. Pgadmin wouldn't work, it uses multiple connections for anything but the most trivial tasks. You can't even send a manual sql query using only one connection. I think that's true for most of the non-trivial tools. > >>Another issue is that we have too many variants of PQconnect > >>already; which of them are we prepared to clone for this > >>hypothetical new connection method? > > > >PQconnectdbParams, I assume. Isn't that the one to rule them all, > >modulo async connect which I can't think is relevant here? > Right. Not all of the parameters will make sense for a stand-alone backend > though, like the hostname and port number. And I think you need need a new > parameter to pass the path to the 'postgres' executable, unless we re-use > the host parameter for that. Hm. I'd guessed that we wouldn't use the connection string to pass down the executable name and the datadir now that we're inventing a separate function. But maybe that's unneccessary. What parameters do we require to be set for that mode: * path to postgres * data directory * database name (single mode after all) * port, because of the shmem key? I'd say that's not important enough I think we also need to be able to pass some additional parameters to postgres: - config_file, hba_file, ... might be required to start pg in some environments - -P, -O , are sometimes required in cases single user mode is neccessary for data recovery. 
So I think we should just allow passing through arguments to postgres. Not sure if we need anything but the pid of the postmaster to be returned? > >Or don't clone that one but instead have > >PQnextConnectionShouldForkThisBinary('...') and let the psql/pg_dump > >switch be --standalone=full-path-to-the-postgres-binary. Yuck, that's ugly. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Review: [HACKERS] ECPG infrastructure changes part 1, was: Re: ECPG fixes
Tested git apply and build again. No warnings. The regression test also looks good to me now. I'm done with this review. (Not sure if I should move it to 'ready for committer' status or whether the CFM should do that.) // Antonin Houska (Tony) On 12/06/2013 02:01 PM, Boszormenyi Zoltan wrote: > On 2013-12-04 14:51, Boszormenyi Zoltan wrote: >> On 2013-12-03 16:48, Antonin Houska wrote: >> >>> Tests - 23.patch >>> >>> >>> src/interfaces/ecpg/test/sql/cursorsubxact.pgc >>> >>> >>> /* >>> * Test the implicit RELEASE SAVEPOINT if a SAVEPOINT >>> * is used with an already existing name. >>> */ >>> >>> Shouldn't it be "... if a CURSOR is used with an already existing >>> name?". Or just "... implicit RELEASE SAVEPOINT after an error"? >>> I'd also appreciate a comment where exactly the savepoint is >>> (implicitly) released. >> >> I have already answered this in my previous reply. > > And I was wrong in that. The comments in the test were rearranged > a little and the fact in the above comment is now actually tested. > > Some harmless unused variables were also removed and an > uninitialized variable usage was fixed. Because of these and the above > changes a lot of patches need to be rebased. > > All patches are attached again for completeness. > > Best regards, > Zoltán Böszörményi >
Re: [HACKERS] Add %z support to elog/ereport?
On 11/11/13, 12:01 PM, Tom Lane wrote: > I do recall Peter saying that the infrastructure knows how to > verify conversion specs in translated strings, so it would have to be > aware of 'z' flags for this to work. It just checks that the same conversion placeholders appear in the translation. It doesn't interpret the actual placeholders. I don't think this will be a problem.
Re: [HACKERS] Trust intermediate CA for client certificates
On Mon, Dec 2, 2013 at 12:44:08PM -0500, Bruce Momjian wrote: > On Sat, Nov 30, 2013 at 12:10:19PM -0500, Bruce Momjian wrote: > > > Drat, you're quite right. I've always included the full certificate > > > chain in client certs but it's in no way required. > > > > > > I guess that pretty much means maintaining the status quo and documenting > > > it better. > > > > I have developed the attached patch to document this behavior. My goals > > were: > > > > * clarify that a cert can match a remote intermediate or root certificate > > * clarify that the client cert must match a server root.crt > > * clarify that the server cert must match a client root.crt > > * clarify that the root certificate does not have to be specified > > in the client or server cert as long as the remote end has the chain > > to the root > > > > Does it meet these goals? Is it correct? > > I have updated the patch, attached, to be clearer about the requirement > that intermediate certificates need a chain to root certificates. Patch applied to head. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> I believe that the spec requires that the "direct" arguments of > Tom> an inverse or hypothetical-set aggregate must not contain any > Tom> Vars of the current query level. > In any event, going by the docs on the web, Oracle does not forbid > grouped columns there (their wording is "This expr must be constant > within each aggregation group.") MSSQL seems to require a literal > constant, but that's obviously not per spec. IBM doesn't seem to > have it in db2 for linux, but some of their other products have it > and include examples of using grouped vars: see > http://pic.dhe.ibm.com/infocenter/ntz/v7r0m3/index.jsp?topic=%2Fcom.ibm.nz.dbu.doc%2Fc_dbuser_hypothetical_set_family_syntax.html OK, that's reasonably convincing. I think we'll need a HINT or something to clarify the error message, because it sure looks like those arguments are "used in an aggregate function". > Tom> 2. For an ordered set function, n must equal aggnfixedargs. We > Tom> treat all n fixed arguments as contributing to the aggregate's > Tom> result collation, but ignore the sort arguments. > That doesn't work for getting a sensible collation out of > percentile_disc applied to a collatable type. (Which admittedly is an > extension to the spec, which allows only numeric and interval, but it > seems to me to be worth having.) Meh. I don't think you can have that and also have the behavior that multiple ORDER BY items aren't constrained to have the same collation; at least not without some rule that amounts to a special case for percentile_disc, which I'd resist. > Tom> 3. For a hypothetical-set function, n must equal aggnfixedargs > Tom> plus k, and we match up type and collation info of the last k > Tom> fixed arguments with the corresponding sort arguments. The > Tom> first n-k fixed arguments contribute to the aggregate's result > Tom> collation, the rest not. 
> The submitted patch does essentially this but taking the number of > non-variadic args in place of the suggested aggnfixedargs. Presumably > in your version the latter would be derived from the former? I'm not on board with using variadic vs non variadic to determine this. For example, imagine a hypothetical-set function that for some reason supports only a single sort column; there would be no reason to use VARIADIC in its definition, and indeed good reason not to. In any case, I don't think this behavior should be tied to implementation details of the representation of the function signature, and IMV variadic is just that --- particularly for VARIADIC ANY, which is nothing more than a short-cut for overloading the function name with different numbers of ANY arguments. Once we've got a match that involves N direct arguments and K ordering arguments, the behavior should be determinate without respect to just how we got that match. > Tom> Reading back over this email, I see I've gone back and forth > Tom> between using the terms "direct args" and "fixed args" for the > Tom> evaluate-once stuff to the left of WITHIN GROUP. I guess I'm > Tom> not really sold on either terminology, but we need something we > Tom> can use consistently in the code and docs. The spec is no help, > Tom> it has no generic term at all for these args. Does anybody else > Tom> have a preference, or maybe another suggestion entirely? > We (Atri and I) have been using "direct args", but personally I'm not > amazingly happy with it. Documentation for other dbs tends to just call > them "arguments", and refer to the WITHIN GROUP expressions as "ordering > expressions" or similar. Well, given that I was mistaken to think there could be no Vars at all in them, "fixed" may not be le mot juste. Unless somebody's got an alternative to "direct", let's go with that. 
regards, tom lane
Re: [HACKERS] Reference to parent query from ANY sublink
Antonin Houska wrote: > SELECT * > FROM tab1 a > LEFT JOIN > tab2 b > ON a.i = ANY ( > SELECT k > FROM tab3 c > WHERE k = a.i); This query works with k in any or all tables, but the semantics certainly vary depending on where k happens to be. It would help a lot if you showed SQL statements to create and populate the tables involved and/or if you qualified all referenced column names with the table alias to avoid ambiguity. If I assume that the k reference is supposed to be a column in tab3, what you have is a query where you always get all rows from tab1, and for each row from tab1 you either match it to all rows from tab2 or no rows from tab2 depending on whether the tab1 row has a match in tab3. test=# create table tab1 (i int); CREATE TABLE test=# create table tab2 (j int); CREATE TABLE test=# create table tab3 (k int); CREATE TABLE test=# insert into tab1 values (1), (2), (3); INSERT 0 3 test=# insert into tab2 values (4), (5), (6); INSERT 0 3 test=# insert into tab3 values (1), (3); INSERT 0 2 test=# SELECT * FROM tab1 a LEFT JOIN tab2 b ON a.i = ANY ( SELECT k FROM tab3 c WHERE k = a.i); i | j ---+--- 1 | 4 1 | 5 1 | 6 2 | 3 | 4 3 | 5 3 | 6 (7 rows) > SELECT * > FROM tab1 a > LEFT JOIN > ( > SELECT * > tab2 b > SEMI JOIN > ( SELECT k > FROM tab3 c > WHERE k = a.i > ) AS ANY_subquery > ON a.i = ANY_subquery.k > ) AS SJ_subquery > ON true; It is hard to see what you intend here, since this is not valid syntax. I assume you want a FROM keyword before the tab2 reference, but it's less clear what you intend with the SEMI JOIN syntax. PostgreSQL supports semi-joins; but that is an implementation detail for the EXISTS or IN syntax. Could you clarify your intent? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [patch] Client-only installation on Windows
Hello,

According to this page, http://www.postgresql.org/docs/current/static/install-procedure.html client-only installation is possible on UNIX/Linux like this:

gmake -C src/bin install
gmake -C src/include install
gmake -C src/interfaces install
gmake -C doc install

With the attached patch, you can do client-only installation on Windows like this:

install.bat client

This installs:

* client applications (both core and contrib)
* DLLs for libpq and ECPG
* header files
* import libraries
* pg_service.conf.sample and psqlrc.sample
* symbol files (*.pdb) for the above modules

If the second argument is given as "all" or omitted, all files are installed. With 9.4, the whole installation takes up about 80 MB, and the client-only installation takes up only 24 MB.

Any comments would be appreciated.

Regards
MauMau

client_only_install_win.patch
Description: Binary data
Re: [HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Alvaro Herrera"
There was also the idea that this would be driven off SQLSTATE but this seems pretty unwieldy to me.

You are referring to this long discussion, aren't you? http://www.postgresql.org/message-id/19791.1335902...@sss.pgh.pa.us

I've read it before, and liked the SQLSTATE-based approach. It seems like properly assigned SQLSTATEs can be used as message IDs, and pairs of SQLSTATE and its user action might be utilized to provide a sophisticated database administration GUI.

That discussion sounds interesting, and I want to take more time to consider it. But what do you think of my original suggestion to easily solve the current issue? I'd like to remove the current annoying problem first before spending much time on more exciting infrastructure.

Regards
MauMau
Re: [HACKERS] Feature request: Logging SSL connections
>> Anything else missing?
>
> Functionally it's fine now, but I see a few style problems:
>
> - "if (port->ssl > 0)" is wrong, ->ssl is a pointer. So use just "if (port->ssl)".
> - Deeper indentation would look nicer with braces.
> - There are some duplicated messages, could you restructure it so that each message exists only once.

New version is attached. I could add braces before and after the ereport() calls too, but then I need two more #ifdefs to catch the closing braces.

--
Dr. Andreas Kunert
HU-Berlin, ZE Rechenzentrum (CMS)
www.hu-berlin.de/~kunert

--- postinit.c.orig	2013-12-06 14:42:14.827832432 +0100
+++ postinit.c	2013-12-06 14:42:25.171842695 +0100
@@ -221,13 +221,31 @@
 	if (Log_connections)
 	{
 		if (am_walsender)
-			ereport(LOG,
-					(errmsg("replication connection authorized: user=%s",
-							port->user_name)));
+		{
+#ifdef USE_SSL
+			if (port->ssl)
+				ereport(LOG,
+						(errmsg("replication connection authorized: user=%s SSL(protocol: %s, cipher: %s)",
+								port->user_name, SSL_get_version(port->ssl), SSL_get_cipher(port->ssl))));
+			else
+#endif
+				ereport(LOG,
+						(errmsg("replication connection authorized: user=%s",
+								port->user_name)));
+		}
 		else
-			ereport(LOG,
-					(errmsg("connection authorized: user=%s database=%s",
-							port->user_name, port->database_name)));
+		{
+#ifdef USE_SSL
+			if (port->ssl)
+				ereport(LOG,
+						(errmsg("connection authorized: user=%s database=%s SSL(protocol: %s, cipher: %s)",
+								port->user_name, port->database_name, SSL_get_version(port->ssl), SSL_get_cipher(port->ssl))));
+			else
+#endif
+				ereport(LOG,
+						(errmsg("connection authorized: user=%s database=%s",
+								port->user_name, port->database_name)));
+		}
 	}

 	set_ps_display("startup", false);
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 2013-12-06 22:35:21 +0900, MauMau wrote:
> From: "Tom Lane"
> > No. They are FATAL so far as the individual session is concerned.
> > Possibly some documentation effort is needed here, but I don't think
> > any change in the code behavior would be an improvement.
>
> You are suggesting that we should add a note like "Don't worry about the
> following message. This is a result of normal connectivity checking",
> aren't you?
>
> FATAL: the database system is starting up

Uh. An explanation of why you cannot connect to the database hardly seems like a superfluous log message.

Greetings,

Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Josh Berkus"
> Heck, I'd be happy just to have a class of messages which specifically
> means "OMG, there's something wrong with the server", that is, a flag for
> messages which only occur when PostgreSQL encounters a bug, data
> corruption, or platform error. Right now, I have to suss those out by
> regex.

What are the issues with using SQLSTATE XXnnn as a filter?

Regards
MauMau
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Tom Lane"
> No. They are FATAL so far as the individual session is concerned.
> Possibly some documentation effort is needed here, but I don't think
> any change in the code behavior would be an improvement.

You are suggesting that we should add a note like "Don't worry about the
following message. This is a result of normal connectivity checking",
aren't you?

FATAL: the database system is starting up

But I doubt most users would notice such notes. Anyway, lots of such
messages certainly make monitoring and troubleshooting harder, because
valuable messages get buried.

> > 4. FATAL: sorry, too many clients already
> > Report these as FATAL to the client because the client wants to know the
> > reason. But don't output them to the server log because they are not
> > necessary for DBAs (4 is subtle.)
>
> The notion that a DBA should not be allowed to find out how often #4 is
> happening is insane.

I thought someone would point that out. You are right, #4 is a strong hint
for the DBA about the max_connections setting or connection pool
configuration.

Regards
MauMau
Re: [HACKERS] Feature request: Logging SSL connections
On Fri, Dec 06, 2013 at 11:43:55AM +0100, Dr. Andreas Kunert wrote:
> > > That seems useful. Do we need more information, like whether a client
> > > certificate was presented, or what ciphers were used?
> >
> > Yes, please show ciphersuite and TLS version too. Andreas, you can use my
> > recent \conninfo patch as template:
> >
> > https://github.com/markokr/postgres/commit/7d1b27ac74643abd15007cc4ec0b56ba92b39d90
> >
> > Also, please show the SSL level for walsender connections too. It's
> > quite important to know whether they are using SSL or not.
> >
> > But I think the 'bits' output is unnecessary, as the cipher strength
> > is known from the ciphersuite. Perhaps it can be removed from \conninfo too.
>
> A new patch is attached. I added the ciphersuite and TLS version
> like shown in your template (minus the 'bits' output). I also added
> the SSL information for walsender connections, but due to a missing
> test setup I cannot test that part.
>
> Anything else missing?

Functionally it's fine now, but I see a few style problems:

- "if (port->ssl > 0)" is wrong, ->ssl is a pointer. So use just "if (port->ssl)".
- Deeper indentation would look nicer with braces.
- There are some duplicated messages; could you restructure it so that each
  message exists only once. Something like this perhaps:

    #ifdef USE_SSL
    if (port->ssl)
    {
        if (walsender)
            ..
        else
            ..
    }
    else
    #endif
        ...

-- marko
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Peter Eisentraut"
> Yeah, this is part of a more general problem, which you have characterized
> correctly: What is fatal (or error, or warning, ...) to the client isn't
> necessarily fatal (or error, or warning, ...) to the server or DBA.

Thanks. In addition, #5 and #6 in my previous mail are even unnecessary for
both the client and the DBA, aren't they?

> Fixing this would need a larger enhancement of the logging infrastructure.
> It's been discussed before, but it's a bit of work.

How about the easy fix I proposed? The current logging infrastructure seems
enough to solve the original problem with small effort and without
complicating the code. If you don't like "log_min_messages = PANIC",
SetConfigOption() can be used instead.

I think we'd better take a step to eliminate the problem at hand, as well
as consider a much richer infrastructure in the long run. I'm also
interested in the latter, and want to discuss it after solving the problem
in front of me.

Regards
MauMau
Re: [HACKERS] shared memory message queues
On 2013-12-05 14:07:39 -0500, Robert Haas wrote:
> On Thu, Dec 5, 2013 at 8:29 AM, Andres Freund wrote:
> > Hm. The API change of on_shmem_exit() is going to cause a fair bit of
> > pain. There are a fair number of extensions out there using it and all
> > would need to be littered by version dependent #ifdefs. What about
> > providing a version of on_shmem_exit() that allows to specify the exit
> > phase, and make on_shmem_exit() a backward compatible wrapper?
>
> Or, we could have on_user_exit() or so and leave on_shmem_exit() alone
> altogether, which would probably be transparent for nearly everyone.
> I kind of like that better, except that I'm not sure whether
> on_user_exit() is any good as a name.

Leaving it as separate calls sounds good to me as well - but like you I
don't like on_user_exit() that much. Not sure if I can come up with
something really good... on_shmem_exit_phase() or at_shmem_exit_phase() if
we go for a function allowing to specify phases, and just
before_shmem_exit() for the "user level" things?

> > FWIW, I find on_dsm_detach_cancel() to be a confusing name. I thought it
> > might be a variant that is called in different circumstances than
> > on_dsm_detach(). Maybe go with cancel_on_shmem_detach(), like with
> > cancel_shmem_exit()?
>
> It could be cancel_on_dsm_detach() if you like that better. Not
> cancel_on_shmem_detach(), though.

Yes, I like that better. The shm instead of dsm was just a thinko ;)

> > Hm. A couple of questions/comments:
> > * How do you imagine keys to be used/built?
> > * Won't the sequential search over toc entries become problematic?
> > * Some high level overview of how it's supposed to be used wouldn't
> >   hurt.
> > * the estimator stuff doesn't seem to belong in the public header?
>
> The test-shm-mq patch is intended to illustrate these points. In that
> case, for an N-worker configuration, there are N+1 keys; that is, N
> message queues and one fixed-size control area.
> The estimator stuff
> is critically needed in the public header so that you can size your
> DSM to hold the stuff you intend to store in it; again, see
> test-shm-mq.

I still think shm_mq.c needs to explain more of that. That's where I'd
look for a brief high-level explanation, no?

> > > Patch #3, shm-mq-v1.patch, is the heart of this series. It creates an
> > > infrastructure for sending and receiving messages of arbitrary length
> > > using ring buffers stored in shared memory (presumably dynamic shared
> > > memory, but hypothetically the main shared memory segment could be
> > > used).
> >
> > The API seems to assume it's in dsm tho?
>
> The header file makes no reference to dsm anywhere, so I'm not sure
> why you draw this conclusion.

The reason I was asking was this reference to dsm:

+shm_mq_handle *
+shm_mq_attach(shm_mq *mq, dsm_segment *seg, BackgroundWorkerHandle *handle)

I'd only looked at the header at that point, but looking at the function's
comment it's optional.

> > Hm. Too bad, I'd hoped for single-reader, multiple-writer :P
>
> Sure, that might be useful, too, as might multiple-reader,
> multi-writer. But those things would come with performance and
> complexity trade-offs of their own. I think it's appropriate to leave
> the task of creating those things to the people that need them. If it
> turns out that this can be enhanced to work like that without
> meaningful loss of performance, that's fine by me, too, but I think
> this has plenty of merit as-is.

Yea, it's perfectly fine not to implement what I wished for ;). I just had
hoped you would magically develop what I was dreaming about...

> It's symmetric. The idea is that you try to read or write data; should
> that fail, you wait on your latch and try again when it's set.

Ok, good. That's what I thought.

> > Couple of questions:
> > * Some introductory comments about the concurrency approach would be
> >   good.
>
> Not sure exactly what to write.
I had a very quick look at shm_mq_receive()/send() and
shm_mq_receive_bytes()/shm_mq_send_bytes() - while the low level steps
seem to be explained fairly well, the high level design of the queue
doesn't seem to be explained much. I was first thinking that you were
trying to implement a lockless queue (which is quite efficiently possible
for 1:1 queues), because I didn't see any locking in them until I noticed
it's just delegated to helper functions.

> > * shm_mq_attach() is talking about BackgroundWorkerHandles - is there a
> >   limitation that a bgworker has to be involved?
> [sensible stuff]
>
> Possibly there could be some similar mechanism to wait for a
> non-background-worker to stop, but I haven't thought much about what
> that would look like.

I wonder if we couldn't just generally store a "generation" number for
each PGPROC which is increased every time the slot is getting reused. Then
one could simply check whether mq->mq_sender->generation ==
mq->mq_sender_generation.

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On Fri, Dec 6, 2013 at 3:39 PM, Haribabu kommi wrote:
> On 06 December 2013 12:29 Amit Kapila wrote:
> > > Note -
> > > a. Performance data is taken on my laptop, needs to be tested on
> > >    some better m/c
> > > b. Attached patch is just a prototype of the chunkwise concept, code
> > >    needs to be improved and decode handling/test is pending.
> >
> > I ran the performance test on a linux machine and attached the results
> > in the mail.
>
> I ran the performance test on the above patches including another patch
> which does snappy hash instead of normal hash in the LZ algorithm. The
> performance readings and the patch with snappy hash not including new data
> in compression are attached in the mail.

Thanks for taking the data.

> The chunk wise approach is giving good performance in most of the
> scenarios.

Agreed. A summarization of the data for LZ/chunkwise encoding, especially
for non-compressible (hundred tiny fields, all changed/half changed) or
less compressible data (hundred tiny fields, half nulled) w.r.t. CPU usage
is as below:

a. For hard disk, there is an overhead of 7~16% with LZ delta encoding and
   an overhead of 5~8% with chunkwise encoding.
b. For tmpfs (which means operating on RAM as disk), there is an overhead
   of 19~26% with LZ delta encoding and an overhead of 9~18% with chunkwise
   encoding.

There might be some variation in the data (in your last mail the overhead
for the chunkwise method on tmpfs was < 12%), but in general the data
suggests that chunkwise encoding has less overhead than LZ encoding for
non-compressible data, and for the rest it is better or equal.

Now, I think we have the below options for this patch:

a. If the performance overhead for the worst case is acceptable (we can
   try to reduce it some more, but I don't think it will be anything
   drastic), then this can be done without any flag/option.
b. Have it with a table-level option to enable/disable WAL compression.
c. Drop this patch, as for worst cases there is some performance overhead.
d. Go back and work more on it, if there are any further suggestions for
   improvement.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] dblink performance regression
On Fri, Dec 6, 2013 at 2:50 AM, Joe Conway wrote: > > On 12/05/2013 07:16 PM, Fabrízio de Royes Mello wrote: > > Hi Joe, how are you? > > Hi Fabrizio -- great to hear from you! I'm well. > :-) > > Well, when Tom sent this email I was reviewing your patch and the > > main suggestion is about use of 'pg_encoding_to_char' too... ;-) > > > > The attached patch with my review! > > Awesome, thanks for the review! > You're welcome! Greetings, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Timbira: http://www.timbira.com.br >> Blog sobre TI: http://fabriziomello.blogspot.com >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] [PATCH] Add transforms feature
On 12/06/2013 07:25 AM, Peter Eisentraut wrote:
> On Tue, 2013-11-26 at 18:10 +0100, Dimitri Fontaine wrote:
> > The problem is installing a set of extensions where some of them are
> > already using the new transform feature and some of them are not. We
> > need a way to cater for that, I think.
>
> Here is an idea. Add a GUC that basically says something like
> use_transforms = on|off. You can then attach that to individual
> functions, which is the right granularity, because only the function
> knows whether its code expects transforms or not. But you can use the
> full power of GUC to configure it any way you want.

It would require the "old" extensions to be modified to have
(SET use_transforms = off) in all their definitions so that they would not
accidentally be called with use_transforms = on from "new" functions, but
otherwise it seems like a good way to get it done without too much effort.

> The only thing this doesn't give you is per-argument granularity, but I
> think the use cases for that are slim, and we don't have a good existing
> mechanism to attach arbitrary attributes to function arguments.

Agreed. And we are quite unlikely to need multiple transforms for the
same type in the same language.

> Actually, I'd take this two steps further.
>
> First, make this parameter per-language, so something like
> plpython.use_transforms. Then it's up to the language implementation
> how they want to deal with this. A future new language could just
> ignore the whole issue and require transforms from the start.

I do not really see much need for this, as it will need to be set for each
individual function anyway.

Actually, what we could do is just declare a new "language" for this, so
functions declared with "LANGUAGE plpythonu" will not be using transforms
and those with "LANGUAGE plpythonuxf" will use them. This would only need
one extra function to be defined in the source code, namely the compile
function for the "new" language.
Some not-transforms-related wild ideas follow :)

Adding a new language would also be a good way to fix the bad syntax
choices in pl/python which require code manipulation before compiling.
I came up with this idea after seeing how pl/jsv8 supports multiple
JavaScript-based languages (standard JavaScript, CoffeeScript,
LiveScript) from the same codebase.

Taking the plv8 ideas further, we could also create a JavaScript-based
"sandboxed python" using things like skulpt and pyjamas, which compile
python source code to the JavaScript VM and inherit all the sandboxing
of v8.

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ
Re: [HACKERS] Feature request: Logging SSL connections
> > > That seems useful. Do we need more information, like whether a client
> > > certificate was presented, or what ciphers were used?
> >
> > Yes, please show ciphersuite and TLS version too. Andreas, you can use my
> > recent \conninfo patch as template:
> >
> > https://github.com/markokr/postgres/commit/7d1b27ac74643abd15007cc4ec0b56ba92b39d90
> >
> > Also, please show the SSL level for walsender connections too. It's
> > quite important to know whether they are using SSL or not.
> >
> > But I think the 'bits' output is unnecessary, as the cipher strength
> > is known from the ciphersuite. Perhaps it can be removed from \conninfo too.

A new patch is attached. I added the ciphersuite and TLS version like shown
in your template (minus the 'bits' output). I also added the SSL
information for walsender connections, but due to a missing test setup I
cannot test that part.

Anything else missing?

--
Andreas

--- postinit.c.orig	2013-12-06 10:26:47.773341120 +0100
+++ postinit.c	2013-12-06 10:37:30.185894650 +0100
@@ -220,6 +220,26 @@
 	if (Log_connections)
 	{
+#ifdef USE_SSL
+		if (am_walsender)
+			if (port->ssl > 0)
+				ereport(LOG,
+						(errmsg("replication connection authorized: user=%s SSL(protocol: %s, cipher: %s)",
+								port->user_name, SSL_get_version(port->ssl), SSL_get_cipher(port->ssl))));
+			else
+				ereport(LOG,
+						(errmsg("replication connection authorized: user=%s",
+								port->user_name)));
+		else
+			if (port->ssl > 0)
+				ereport(LOG,
+						(errmsg("connection authorized: user=%s database=%s SSL(protocol: %s, cipher: %s)",
+								port->user_name, port->database_name, SSL_get_version(port->ssl), SSL_get_cipher(port->ssl))));
+			else
+				ereport(LOG,
+						(errmsg("connection authorized: user=%s database=%s",
+								port->user_name, port->database_name)));
+#else
 		if (am_walsender)
 			ereport(LOG,
 				(errmsg("replication connection authorized: user=%s",
@@ -228,6 +248,7 @@
 			ereport(LOG,
 				(errmsg("connection authorized: user=%s database=%s",
 						port->user_name, port->database_name)));
+#endif
 	}

 	set_ps_display("startup", false);
Re: [HACKERS] [PATCH] Add transforms feature
Peter Eisentraut writes: > Here is an idea. Add a GUC that basically says something like > use_transforms = on|off. You can then attach that to individual > functions, which is the right granularity, because only the function > knows whether its code expects transforms or not. But you can use the > full power of GUC to configure it any way you want. +1 > The only thing this doesn't give you is per-argument granularity, but I > think the use cases for that are slim, and we don't have a good existing > mechanism to attach arbitrary attributes to function arguments. +1 > Actually, I'd take this two steps further. > > First, make this parameter per-language, so something like > plpython.use_transforms. Then it's up to the language implementation > how they want to deal with this. A future new language could just > ignore the whole issue and require transforms from the start. I'm not sure about this level of granularity, but why not. > Second, depending on the choice of the language, this parameter could > take three values: ignore | if available | require. That would allow > users to set various kinds of strictness, for example if they want to be > alerted that a language cannot deal with a particular type. My understanding is that it always can deal with any particular type if you consider text based input/output, right? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] spinlocks storm bug
2013/12/6 Andres Freund
> On 2013-12-06 07:22:27 +0100, Pavel Stehule wrote:
> > I have a report of a critical bug (database is temporarily unavailable ..
> > restart is necessary).
> > PostgreSQL 9.2.4,
> > 24 CPU
> > 140G RAM
> > SSD disc for all
> >
> > Database is under high load. There are a few databases with a very high
> > number of similar simple statements. When the application produces
> > higher load, the number of active connections increases to about 300-600.
> >
> > At some moment the described event starts - there is minimal IO, all
> > CPUs are at 100%.
> >
> > Perf result shows:
> >    354246.00 93.0% s_lock        /usr/lib/postgresql/9.2/bin/postgres
> >     10503.00  2.8% LWLockRelease /usr/lib/postgresql/9.2/bin/postgres
> >      8802.00  2.3% LWLockAcquire
> >
> > We tried to limit connections to 300, but I am not sure if this issue
> > is not related to some Postgres bug.
>
> We've seen this issue repeatedly now. None of the times has it turned out
> to be a bug, just limitations in postgres' scalability. If you can, I'd
> strongly suggest trying to get postgres binaries compiled with
> -fno-omit-frame-pointer installed to check which locks are actually
> contended. My bet is BufMappingLock.
>
> There's a CF entry about changing our lwlock implementation to scale
> better...

One missing piece of info - the customer's staff reduced shared buffers
from 30G to 5G without success. The database is about 20G.

Regards

Pavel

> Greetings,
>
> Andres Freund
Re: [HACKERS] Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan
I wrote:
> Amit Khandekar wrote:
> > Yes, I agree that rather than looking at the bitmap heap scan to track
> > the number of pages, we should look somewhere in the underlying index
> > scan. Yes, we should get a constant number of index pages regardless
> > of the actual parent table rows.
>
> I agree with you. I'll modify the patch to show 1) the number of the
> exact/lossy pages in a TIDBitmap by examining the underlying index scan,
> not the number of these pages that have been fetched in the bitmap heap
> scan, and 2) the memory requirement.

Though at first I agreed on this, while working on it I have started to
think that the information about (2) is enough for tuning work_mem. Here
are examples using a version under development, where "Bitmap Memory
Usage" means (peak) memory space used by a TIDBitmap, and "Desired" means
the memory required to guarantee non-lossy storage of a TID set, which is
shown only when the TIDBitmap has been lossified. (work_mem = 1MB.)

postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.0001 and 0.0005;
                                QUERY PLAN
-----------------------------------------------------------------------------
 Bitmap Heap Scan on demo  (cost=77.14..12142.69 rows=3581 width=42) (actual time=1.748..53.203 rows=4112 loops=1)
   Recheck Cond: ((col2 >= 0.0001::double precision) AND (col2 <= 0.0005::double precision))
   Bitmap Memory Usage: 315kB
   ->  Bitmap Index Scan on demo_col2_idx  (cost=0.00..76.25 rows=3581 width=0) (actual time=1.113..1.113 rows=4112 loops=1)
         Index Cond: ((col2 >= 0.0001::double precision) AND (col2 <= 0.0005::double precision))
 Total runtime: 53.804 ms
(6 rows)

postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.05;
                                QUERY PLAN
-----------------------------------------------------------------------------
 Bitmap Heap Scan on demo  (cost=8307.41..107635.14 rows=391315 width=42) (actual time=84.818..2709.015 rows=400172 loops=1)
   Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.05::double precision))
   Rows Removed by Index Recheck: 8815752
   Bitmap Memory Usage: 1025kB (desired 20573kB)
   ->  Bitmap Index Scan on demo_col2_idx  (cost=0.00..8209.58 rows=391315 width=0) (actual time=83.664..83.664 rows=400172 loops=1)
         Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.05::double precision))
 Total runtime: 2747.088 ms
(7 rows)

Should we look at (1) as well? (Honestly, I don't know what to show about
(1) when using a bitmap scan on the inside of a nestloop join. For memory
usage and desired memory I think the maximum values would be fine.) I'd
like to hear your opinion again.

Thanks,

Best regards,
Etsuro Fujita
Re: [HACKERS] spinlocks storm bug
On 2013-12-06 07:22:27 +0100, Pavel Stehule wrote:
> I have a report of a critical bug (database is temporarily unavailable ..
> restart is necessary).
> PostgreSQL 9.2.4,
> 24 CPU
> 140G RAM
> SSD disc for all
>
> Database is under high load. There are a few databases with a very high
> number of similar simple statements. When the application produces higher
> load, the number of active connections increases to about 300-600.
>
> At some moment the described event starts - there is minimal IO, all CPUs
> are at 100%.
>
> Perf result shows:
>    354246.00 93.0% s_lock        /usr/lib/postgresql/9.2/bin/postgres
>     10503.00  2.8% LWLockRelease /usr/lib/postgresql/9.2/bin/postgres
>      8802.00  2.3% LWLockAcquire
>
> We tried to limit connections to 300, but I am not sure if this issue is
> not related to some Postgres bug.

We've seen this issue repeatedly now. None of the times has it turned out
to be a bug, just limitations in postgres' scalability. If you can, I'd
strongly suggest trying to get postgres binaries compiled with
-fno-omit-frame-pointer installed to check which locks are actually
contended. My bet is BufMappingLock.

There's a CF entry about changing our lwlock implementation to scale
better...

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Backup throttling
Hi,

On 2013-12-05 15:36, Antonin Houska wrote:
> On 12/02/2013 02:23 PM, Boszormenyi Zoltan wrote:
> > Hi,
> >
> > I am reviewing your patch.
>
> Thanks. New version attached.

I have reviewed and tested it and marked it as ready for committer.

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: [HACKERS] ANALYZE sampling is too good
On 2013-12-05 17:52:34 -0800, Peter Geoghegan wrote:
> Has anyone ever thought about opportunistic ANALYZE piggy-backing on
> other full-table scans? That doesn't really help Greg, because his
> complaint is mostly that a fresh ANALYZE is too expensive, but it
> could be an interesting, albeit risky approach.

What I've been thinking of is:

a) making it piggyback on the scans vacuum is doing instead of doing
separate ones all the time (if possible; analyze needs to be more
frequent). Currently, with quite some likelihood, the cache will be gone
again when revisiting.

b) making analyze incremental. In lots of bigger tables most of the table
is static - and we actually *do* know that, thanks to the vm. So keep a
rawer form of what ends up in the catalogs around somewhere, chunked by
the region of the table the statistic is from. Every time a part of the
table changes, re-sample only that part. Then recompute the aggregate.

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] ANALYZE sampling is too good
On Fri, Dec 6, 2013 at 7:22 AM, Peter Geoghegan wrote:
> On Thu, Dec 5, 2013 at 3:50 PM, Josh Berkus wrote:
> > There are fairly well researched algorithms for block-based sampling
> > which estimate for the skew introduced by looking at consecutive rows
> > in a block. In general, a minimum sample size of 5% is required, and
> > the error is no worse than our current system. However, the idea was
> > shot down at the time, partly because I think other hackers didn't get
> > the math.
>
> I think that this certainly warrants revisiting. The benefits would be
> considerable.
>
> Has anyone ever thought about opportunistic ANALYZE piggy-backing on
> other full-table scans? That doesn't really help Greg, because his
> complaint is mostly that a fresh ANALYZE is too expensive, but it
> could be an interesting, albeit risky approach.

Is only a fresh ANALYZE costly, or are consecutive ones equally costly?

Doing it in some background operation might not be a bad idea, but doing
it during backend query execution (seq scan) might add overhead to query
response time, especially if part or most of the data for the table is in
RAM. Here the overhead due to the actual read might not be very high, but
the calculation for analyze (like sorting) will make it costly.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes:

 Tom> Further questions about WITHIN GROUP:

 Tom> I believe that the spec requires that the "direct" arguments of
 Tom> an inverse or hypothetical-set aggregate must not contain any
 Tom> Vars of the current query level.

False. The spec requires that the direct arguments must not contain
_ungrouped_ columns (see ), but nowhere are grouped columns forbidden.

 Tom> They don't manage to say that in plain English, of course, but
 Tom> in the case the behavior is defined
 Tom> in terms of this sub-select:

 Tom> ( SELECT 0, SK1, ..., SKK
 Tom>   FROM TNAME
 Tom>   UNION ALL
 Tom>   VALUES (1, VE1, ..., VEK) )

"TNAME" there is not the input table or an alias for it, but rather the
specific subset of rows to which the grouping operation is being applied
(after applying a FILTER if any). We're in the General Rules here, not the
Syntax Rules, so this is describing _how to compute the result_ rather
than a syntactic transformation of the input.

In any event, going by the docs on the web, Oracle does not forbid grouped
columns there (their wording is "This expr must be constant within each
aggregation group."). MSSQL seems to require a literal constant, but
that's obviously not per spec. IBM doesn't seem to have it in db2 for
linux, but some of their other products have it and include examples of
using grouped vars: see
http://pic.dhe.ibm.com/infocenter/ntz/v7r0m3/index.jsp?topic=%2Fcom.ibm.nz.dbu.doc%2Fc_dbuser_hypothetical_set_family_syntax.html

So I'm going to say that the majority opinion is on my side here.

 Tom> So that's all fine, but the patch seems a bit confused about it.

The patch treats vars in the direct args exactly as it would treat them
anywhere else where they must be ungrouped.

[snip a bunch of stuff based on false assumptions]

 Tom> What I now think we should do about the added pg_aggregate
 Tom> columns is to have:

 Tom>   aggnfixedargs  int   number of fixed arguments, excluding any
 Tom>                        hypothetical ones (always 0 for normal
 Tom>                        aggregates)
 Tom>   aggkind        char  'n' for normal aggregate,
 Tom>                        'o' for ordered set function,
 Tom>                        'h' for hypothetical-set function

I don't see any obvious problem with this.

 Tom> with the parsing rules that given

 Tom> agg( n fixed arguments ) WITHIN GROUP ( ORDER BY k sort specifications )

 Tom> 1. WITHIN GROUP is disallowed for normal aggregates.

(This is what the submitted patch does.)

 Tom> 2. For an ordered set function, n must equal aggnfixedargs. We
 Tom> treat all n fixed arguments as contributing to the aggregate's
 Tom> result collation, but ignore the sort arguments.

That doesn't work for getting a sensible collation out of percentile_disc
applied to a collatable type. (Which admittedly is an extension to the
spec, which allows only numeric and interval, but it seems to me to be
worth having.)

 Tom> 3. For a hypothetical-set function, n must equal aggnfixedargs
 Tom> plus k, and we match up type and collation info of the last k
 Tom> fixed arguments with the corresponding sort arguments. The
 Tom> first n-k fixed arguments contribute to the aggregate's result
 Tom> collation, the rest not.

The submitted patch does essentially this, but taking the number of
non-variadic args in place of the suggested aggnfixedargs. Presumably in
your version the latter would be derived from the former?

 Tom> Reading back over this email, I see I've gone back and forth
 Tom> between using the terms "direct args" and "fixed args" for the
 Tom> evaluate-once stuff to the left of WITHIN GROUP. I guess I'm
 Tom> not really sold on either terminology, but we need something we
 Tom> can use consistently in the code and docs. The spec is no help,
 Tom> it has no generic term at all for these args. Does anybody else
 Tom> have a preference, or maybe another suggestion entirely?

We (Atri and I) have been using "direct args", but personally I'm not
amazingly happy with it. Documentation for other dbs tends to just call
them "arguments", and refer to the WITHIN GROUP expressions as "ordering
expressions" or similar.

--
Andrew (irc:RhodiumToad)