Re: [HACKERS] Extension Templates S03E11
On Wed, 2013-12-04 at 14:54 -0500, Tom Lane wrote: I think Stephen has already argued why it could be a good idea here. But in a nutshell: it seems like there are two use-cases to be supported, one where you want CREATE EXTENSION hstore to give you some appropriate version of hstore, and one where you want to restore exactly what you had on the previous installation. It seems to me that exploding the extension by dumping, rather than suppressing, its component objects is by far the most reliable way of accomplishing the latter. The behavior of an extension should not depend on how it was installed. The kind of extension being described by Stephen will: * Not be updatable by doing ALTER EXTENSION foo UPDATE TO '2.0' * Dump out objects that wouldn't be dumped if they had installed the extension using the filesystem So if we do it this way, then we should pick a new name, like package. And then we'll need to decide whether it still makes sense to use an external tool to transform a PGXN extension into a form that could be loaded as a package. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: David Johnston pol...@yahoo.com 5. FATAL: terminating walreceiver process due to administrator command 6. FATAL: terminating background worker "%s" due to administrator command 5 and 6: I don't fully understand when they would happen, but they likely fall into the same "the DBA should know what is going on with their server and should confirm any startup/shutdown activity it is involved with" category. They might be better categorized at NOTICE level if they were in response to an administrator action, versus in response to a crashed process, but even for the user-initiated situation, making sure they hit the log, even at FATAL, is totally understandable and IMO desirable. #5 is output when the DBA shuts down the replication standby server. #6 is output when the DBA shuts down the server if he is using any custom background worker. These messages are always output. What I'm seeing as a problem is that FATAL messages are output in a normal situation, which worries the DBA, and those messages don't help the DBA with anything. They merely worry the DBA. Regards MauMau
Re: [HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: David Johnston pol...@yahoo.com PITR/Failover is not generally that frequent an occurrence, but I will agree that these events are likely common during such. Maybe PITR/Failover mode can output something in the logs to alleviate user angst about these frequent events? I'm doubting that a totally separate mechanism can be used for this mode, but instead of looking for things to remove, how about adding some additional coddling to the logs at the beginning and end of the mode change? Yes, those messages are not output frequently, because they are only output during planned or unplanned downtime. But frequency does not matter here. Put yourself in the DBA's shoes. They need to perform a maintenance or recovery operation in a hurry and wish to not encounter any trouble. They would panic if something goes wrong when they expect to see no trouble. The FATAL messages merely make them worried. The extra FATAL messages can also be a problem for support staff. How would you feel if the DBA asks you for help when some emergency trouble occurs during recovery, with a few important messages buried in lots of unnecessary FATAL ones? Regards MauMau
Re: [HACKERS] [bug fix] pg_ctl fails with config-only directory
From: Amit Kapila amit.kapil...@gmail.com Today, I again went through all the discussion that happened at that time related to this problem, and I found that later in the discussion something along the lines of your Approach-2 was discussed; please see the link http://www.postgresql.org/message-id/503a879c.6070...@dunslane.net Thank you again. I'm a bit relieved to see similar discussion. Wouldn't the other way to resolve this problem be to re-invoke pg_ctl in non-restricted mode for the case in question? It's effectively the same as not checking admin privileges. And direct invocation of postgres -C/--describe-config still cannot be helped. I think it is important to resolve this problem, so please go ahead and upload this patch to the next CF. Thanks. Regards MauMau
Re: [HACKERS] Recovery to backup point
From: Michael Paquier michael.paqu...@gmail.com On Sat, Dec 7, 2013 at 9:06 AM, MauMau maumau...@gmail.com wrote: Recovery target 'immediate' http://www.postgresql.org/message-id/51703751.2020...@vmware.com May I implement this feature and submit a patch for the next commitfest if I have time? Please feel free. I might as well participate in the review. Thanks. I'm feeling inclined to make the configuration recovery_target = 'backup_point' instead of recovery_target = 'immediate', because: * The meaning of this feature for users is to recover the database to the backup point. * It doesn't seem to need a new parameter; recovery_target_time sounds appropriate because users want to restore the database to the time of the backup. Regards MauMau
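For concreteness, the recovery.conf usage under discussion would look something like the sketch below. The archive path is hypothetical, and the parameter value ('immediate' versus 'backup_point') is exactly what is being debated above:

```ini
# recovery.conf sketch; the archive location is hypothetical
restore_command = 'cp /mnt/server/archivedir/%f %p'

# stop recovery as soon as a consistent state is reached,
# i.e. at the point where the base backup ended
recovery_target = 'immediate'
```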
Re: [HACKERS] [PATCH 1/2] SSL: GUC option to prefer server cipher order
Committed your v2 patch (with default to on). I added a small snippet of documentation explaining that this setting is mainly for backward compatibility.
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On Sat, Dec 7, 2013 at 12:27 AM, David Johnston pol...@yahoo.com wrote: 1. FATAL: the database system is starting up How about altering the message to tone down the severity by a half-step... FATAL: (probably) not! - the database system is starting up Well, it is fatal: the backend for that client doesn't continue. FATAL means a backend died. It is kind of vague how FATAL and PANIC differ. They both sound like the end of the world. If you read FATAL thinking it means the whole service is quitting -- i.e., what PANIC means -- then these messages sound like noise. -- greg
[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
MauMau wrote From: David Johnston <pol...@yahoo.com> 5. FATAL: terminating walreceiver process due to administrator command 6. FATAL: terminating background worker "%s" due to administrator command 5 and 6: I don't fully understand when they would happen, but they likely fall into the same "the DBA should know what is going on with their server and should confirm any startup/shutdown activity it is involved with" category. They might be better categorized at NOTICE level if they were in response to an administrator action, versus in response to a crashed process, but even for the user-initiated situation, making sure they hit the log, even at FATAL, is totally understandable and IMO desirable. #5 is output when the DBA shuts down the replication standby server. #6 is output when the DBA shuts down the server if he is using any custom background worker. These messages are always output. What I'm seeing as a problem is that FATAL messages are output in a normal situation, which worries the DBA, and those messages don't help the DBA with anything. They merely worry the DBA. While worry is something to be avoided, not logging messages is not the correct solution. On the project side, looking for ways and places to better communicate with the user is worthwhile, in the absence of adding additional complexity to the logging system. The user/DBA, though, is expected to learn how the process works, ideally in a testing environment where worry is much less of a problem, and understand that some of what they are seeing are client-oriented messages that they should be aware of but that do not indicate server-level issues by themselves. They should probably run the log through a filter (a template of which the project should provide) that takes the raw log and filters out those things that, relative to the role and context, are really better classified as NOTICE even if the log-level is shown as FATAL. Raw logs should be verbose so that tools can have more data with which to make decisions.
You can always hide what is provided, but you can never show what has been omitted. I get that people will scan raw logs, but targeting that lowest common denominator isn't necessarily the best thing to do. Instead, getting those people on board with better practices, and showing (and providing) them helpful tools, should be the goal. David J.
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 2013-12-07 13:58:11 +, Greg Stark wrote: FATAL means a backend died. It is kind of vague how FATAL and PANIC differ. I don't really see much vagueness there. FATAL is an unexpected but orderly shutdown. PANIC is for the situations where we can't handle the problem that occurred in any orderly way. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On Sat, Dec 7, 2013 at 3:56 PM, Andres Freund and...@2ndquadrant.com wrote: I don't really see much vagueness there. FATAL is an unexpected but orderly shutdown. PANIC is for the situations where we can't handle the problem that occurred in any orderly way. Sorry, I was unclear. I meant that, without context, if someone asked you which is more severe, FATAL or PANIC, you would have no particular way to know. After all, for a person it's easier to cure a panic than a fatality :) On the client end FATAL is pretty logical, but in the logs it makes it sound like the entire server died, especially in this day of multithreaded servers. I was suggesting that that was the origin of the confusion here. Anyone who has seen these messages on the client end many times might interpret them correctly in the server logs, but someone who has only been a DBA, not a database user, might never have seen them except in the server logs, and without the context might not realize that FATAL is a term of art peculiar to Postgres. -- greg
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
On Sat, Dec 7, 2013 at 6:31 AM, Peter Geoghegan p...@heroku.com wrote: On Fri, Dec 6, 2013 at 12:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: There seems to be no problem even if we use bigint as the type of unsigned 32-bit integer like queryid. For example, txid_current() returns the transaction ID, i.e., unsigned 32-bit integer, as bigint. Could you tell me what the problem is when using bigint for queryid? We're talking about the output of some view, right, not internal storage? +1 for using bigint for that. Using OID is definitely an abuse, because the value *isn't* an OID. And besides, what if we someday decide we need 64-bit keys not 32-bit? Fair enough. I was concerned about the cost of external storage of 64-bit integers (unlike query text, they might have to be stored many times for many distinct intervals or something like that), but in hindsight that was fairly miserly of me. Attached revision displays signed 64-bit integers instead. Thanks! Looks good to me. Committed! Regards, -- Fujii Masao
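Tom's point about bigint being lossless here can be shown in one line: every unsigned 32-bit value is exactly representable as a signed 64-bit integer, with headroom if the key is later widened to 64 bits. This is just an illustrative sketch; pg_stat_statements itself simply reports the value through its bigint output column.

```c
#include <assert.h>
#include <stdint.h>

/* An unsigned 32-bit query id always fits in a signed 64-bit integer,
 * so exposing it as bigint loses nothing and abuses no OID. */
int64_t queryid_to_bigint(uint32_t queryid)
{
    return (int64_t) queryid;   /* non-negative, never truncated */
}
```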
Re: [HACKERS] Extension Templates S03E11
* Jeff Davis (pg...@j-davis.com) wrote: On Wed, 2013-12-04 at 14:54 -0500, Tom Lane wrote: I think Stephen has already argued why it could be a good idea here. But in a nutshell: it seems like there are two use-cases to be supported, one where you want CREATE EXTENSION hstore to give you some appropriate version of hstore, and one where you want to restore exactly what you had on the previous installation. It seems to me that exploding the extension by dumping, rather than suppressing, its component objects is by far the most reliable way of accomplishing the latter. The behavior of an extension should not depend on how it was installed. The kind of extension being described by Stephen will: * Not be updatable by doing ALTER EXTENSION foo UPDATE TO '2.0' That's correct, but consider when the above command actually works today: when the new version is magically made available to the backend without any action happening in PG. That works when the filesystem can be updated independently or the backend can reach out to some other place and pull things down but that's, really, a pretty special situation. It works today specifically because we expect the OS packaging system to make changes to the filesystem for us but this whole 'extension template' proposal is about getting away from the filesystem. Instead, I'm suggesting an external tool which can pull down the new version from an external repo and then apply it to the backend. Clearly, my approach supports the general action of updating an extension, it just doesn't expect the filesystem on the server to be changed underneath PG nor that PG will reach out to some external repository on the basis of a non-superuser request to get the update script. * Dump out objects that wouldn't be dumped if they had installed the extension using the filesystem Correct, but the general presumption here is that many of these extensions wouldn't even be available for installation on the filesystem anyway. 
So if we do it this way, then we should pick a new name, like package. I've been on the fence about this for a while. There are definitely pros and cons to consider, but it would go very much against one of the goals here, which is to avoid asking extension authors (or their users, to some extent..) to change, and I expect it'd also be a lot more work to invent something which is 90% the same as extensions. Perhaps there's no help for it and we'll need extensions which are essentially for OS-managed extensions (which can include .so files and friends) and then packages for entirely-in-catalog sets of objects with a certain amount of metadata included (version and the like). And then we'll need to decide whether it still makes sense to use an external tool to transform a PGXN extension into a form that could be loaded as a package. I'd certainly think it would be, but if we're moving away from calling them extensions then I'm afraid extension authors and users would end up having to change something anyway, no matter the tool. Perhaps that's reasonable and we can at least minimize the impact, but much of what extensions offer is, in fact, what's also needed for packages, and I'm not thrilled with the apparent duplication. It just occurred to me that perhaps we can call these something different towards the end user but use the existing catalogs and code for extensions to handle our representation, with a few minor tweaks.. Not sure if I like that idea, but it's a thought. Thanks, Stephen
Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core
On Sun, Nov 17, 2013 at 1:15 PM, Peter Geoghegan p...@heroku.com wrote: On Sat, Nov 16, 2013 at 4:36 PM, Peter Geoghegan p...@heroku.com wrote: I'll post a revision with fixes for those. Another concern is that I see some race conditions that only occur when pg_stat_statements cache is under a lot of pressure with a lot of concurrency, that wasn't revealed by my original testing. I'm working on a fix for that. Revision attached. The patch doesn't apply cleanly against master due to the recent change of pg_stat_statements. Could you update the patch? Regards, -- Fujii Masao
Re: [HACKERS] stats for network traffic WIP
On Wed, Nov 20, 2013 at 3:18 AM, Atri Sharma atri.j...@gmail.com wrote: On Tue, Nov 19, 2013 at 11:43 PM, Mike Blackwell mike.blackw...@rrd.com wrote: This patch looks good to me. It applies, builds, and runs the regression tests. Documentation is included and it seems to do what it says. I don't consider myself a code expert, but as far as I can see it looks fine. This is a pretty straightforward enhancement to the existing pg_stat_* code. If no one has any objections, I'll mark it ready for committer. Mike I agree. I had a discussion with Mike yesterday, and took the performance areas in the patch. I think the impact would be pretty low and since the global counter being incremented is incremented with keeping race conditions in mind, I think that the statistics collected will be valid. So, I have no objections to the patch being marked as ready for committer. Could you share the performance numbers? I'm really concerned about the performance overhead caused by this patch. Here are the comments from me: All the restrictions of this feature should be documented. For example, this feature doesn't track the bytes of the data transferred by FDW. It's worth documenting that kind of information. ISTM that this feature doesn't support SSL case. Why not? The amount of data transferred by walreceiver also should be tracked, I think. I just wonder how conn_received, conn_backend and conn_walsender are useful. Regards, -- Fujii Masao
Re: [HACKERS] stats for network traffic WIP
Sent from my iPad On 07-Dec-2013, at 23:47, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Nov 20, 2013 at 3:18 AM, Atri Sharma atri.j...@gmail.com wrote: On Tue, Nov 19, 2013 at 11:43 PM, Mike Blackwell mike.blackw...@rrd.com wrote: This patch looks good to me. It applies, builds, and runs the regression tests. Documentation is included and it seems to do what it says. I don't consider myself a code expert, but as far as I can see it looks fine. This is a pretty straightforward enhancement to the existing pg_stat_* code. If no one has any objections, I'll mark it ready for committer. Mike I agree. I had a discussion with Mike yesterday, and took the performance areas in the patch. I think the impact would be pretty low and since the global counter being incremented is incremented with keeping race conditions in mind, I think that the statistics collected will be valid. So, I have no objections to the patch being marked as ready for committer. Could you share the performance numbers? I'm really concerned about the performance overhead caused by this patch. I did some pgbench tests, specifically with increasing numbers of clients, as those are the kinds of workloads that can lead to visible slowness due to the increased work in the commonly used functions. Let me see if I can get the numbers and see where I kept them. Here are the comments from me: All the restrictions of this feature should be documented. For example, this feature doesn't track the bytes of the data transferred by FDW. It's worth documenting that kind of information. +1 ISTM that this feature doesn't support SSL case. Why not? The amount of data transferred by walreceiver also should be tracked, I think. Yes, I agree. WAL receiver data transfer can be problematic at times as well, so it should be tracked. Regards, Atri
[HACKERS] Index overhead cost reporting
Hi, I was wondering whether anyone has any insight with regards to measuring and reporting the overhead of maintaining indexes on relations. If an UPDATE is issued to a table with, say, 6 indexes, it would be useful to determine how much time is spent updating each of those indexes. And perhaps such timings would not be confined to indexes, but also other dependants that add overhead, such as triggers, rules, and in future, eager incremental materialised view updates. One use case I had in mind is observing whether any index is particularly burdensome to the overall plan. Then an analysis of those numbers could show that, for example, 25% of the time spent on DML on table my_table was spent on maintaining index idx_my_table_a... an index that's not often used. Is that something that could be provided in an EXPLAIN ANALYSE node? Thom
Re: [HACKERS] Index overhead cost reporting
Thom Brown t...@linux.com writes: I was wondering whether anyone has any insight with regards to measuring and reporting the overhead of maintaining indexes on relations. If an UPDATE is issued to a table with, say, 6 indexes, it would be useful to determine how much time is spent updating each of those indexes. And perhaps such timings would not be confined to indexes, but also other dependants that add overhead, such as triggers, rules, and in future, eager incremental materialised view updates. We already do track the time spent in triggers. Although Kevin might have a different idea, I'd think that matview updates should also be driven by triggers, so that the last item there would be covered. Is that something that could be provided in an EXPLAIN ANALYSE node? Well, it'd not be particularly difficult to add measurements of the time spent in ExecInsertIndexTuples, but I'd have some concerns about that: * Instrumentation overhead. That's already painful on machines with slow gettimeofday, and this has the potential to add a lot more, especially with the more expansive readings of your proposal. * Is it really measuring the right thing? To a much greater degree than for some other things you might try to measure, just counting time spent in ExecInsertIndexTuples is going to understate the true cost of updating an index, because so much of the true cost is paid asynchronously; viz, writing WAL as well as the actual index pages. We already have that issue with measuring the runtime of a ModifyTable node as a whole, but slicing and dicing that time ten ways would make the results even more untrustworthy, IMO. * There are also other costs to think of, such as the time spent by VACUUM on maintaining the index, and the time spent by the planner considering (perhaps vainly) whether it can use the index for each query that reads the table. In principle you could instrument VACUUM to track the time it spends updating each index, and log that in the pgstats infrastructure. 
(I'd even think that might be a good idea, except for the bloat effect on the pgstats files.) I'm not at all sure there's any practical way to measure the distributed planner overhead; it's not paid in discrete chunks large enough to be timed easily. Perhaps it's small enough to ignore, but I'm not sure. Bottom line, I think it's a harder problem than it might seem at first glance. regards, tom lane
Re: [HACKERS] ANALYZE sampling is too good
On Tue, Dec 3, 2013 at 6:30 PM, Greg Stark st...@mit.edu wrote: I always gave the party line that ANALYZE only takes a small constant-sized sample, so even very large tables should be very quick. But after hearing the same story again at Heroku I looked into it a bit further. I was kind of shocked by the numbers. ANALYZE takes a sample of 300 * statistics_target rows. That sounds pretty reasonable, but with default_statistics_target set to 100 that's 30,000 rows. If I'm reading the code right, it takes this sample by sampling 30,000 blocks and then (if the table is large enough) taking an average of one row per block. Each block is 8192 bytes, so that means it's reading 240MB of each table. That's a lot more than I realized. That is a lot. On the other hand, I question the subject line: sometimes, our ANALYZE sampling is not good enough. Before we raised the default statistics target from 10 to 100, complaints about bad plan choices due to insufficiently-precise statistics were legion -- and we still have people periodically proposing to sample a fixed percentage of the table instead of a fixed amount of the table, even on large tables, which is going in the opposite direction. I think this is because they're getting really bad n_distinct estimates, and no fixed-size sample can reliably give a good one. More generally, I think the basic problem that people are trying to solve by raising the statistics target is to avoid index scans on gigantic tables. Obviously, there are a lot of other situations where inadequate statistics can cause problems, but that's a pretty easy-to-understand one that we do not always get right. We know that an equality search looking for some_column = 'some_constant', where some_constant is not an MCV, must be more selective than a search for the least-frequent MCV. If you store more and more MCVs for a table, eventually you'll have enough that the least-frequent one is pretty infrequent, and then things work a lot better.
This is more of a problem for big tables than for small tables. MCV #100 can't have a frequency of greater than 1/100 = 0.01, but that's a lot more rows on a big table than on a small one. On a table with 10 million rows we might estimate something close to 100,000 rows when the real number is just a handful; when the table has only 10,000 rows, we just can't be off by as many orders of magnitude. Things don't always work out that badly, but in the worst case they do. Maybe there's some highly-principled statistical approach which could be taken here, and if so that's fine, but I suspect not. So what I think we should do is auto-tune the statistics target based on the table size. If, say, we think that the generally useful range for the statistics target is something like 10 to 400, then let's come up with a formula based on table size that outputs 10 for small tables, 400 for really big tables, and intermediate values for tables in the middle. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan
On Fri, Dec 6, 2013 at 5:02 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: Though at first I agreed on this, while working on it I started to think that information about (2) is enough for tuning work_mem. Here are examples using a version under development, where Bitmap Memory Usage means (peak) memory space used by a TIDBitmap, and Desired means the memory required to guarantee non-lossy storage of a TID set, which is shown only when the TIDBitmap has been lossified. (work_mem = 1MB.) I'd be wary of showing a desired value unless it's highly likely to be accurate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] shared memory message queues
On Fri, Dec 6, 2013 at 8:12 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-12-05 14:07:39 -0500, Robert Haas wrote: On Thu, Dec 5, 2013 at 8:29 AM, Andres Freund and...@2ndquadrant.com wrote: Hm. The API change of on_shmem_exit() is going to cause a fair bit of pain. There are a fair number of extensions out there using it and all would need to be littered with version-dependent #ifdefs. What about providing a version of on_shmem_exit() that allows specifying the exit phase, and making on_shmem_exit() a backward-compatible wrapper? Or, we could have on_user_exit() or so and leave on_shmem_exit() alone altogether, which would probably be transparent for nearly everyone. I kind of like that better, except that I'm not sure whether on_user_exit() is any good as a name. Leaving it as separate calls sounds good to me as well - but like you I don't like on_user_exit() that much. Not sure if I can come up with something really good... on_shmem_exit_phase() or at_shmem_exit_phase() if we go for a function allowing to specify phases, and just before_shmem_exit() for the user-level things? Hmm, before_shmem_exit() and on_shmem_exit() doesn't sound bad. Hm. A couple of questions/comments: * How do you imagine keys to be used/built? * Won't the sequential search over toc entries become problematic? * Some high-level overview of how it's supposed to be used wouldn't hurt. * The estimator stuff doesn't seem to belong in the public header? The test-shm-mq patch is intended to illustrate these points. In that case, for an N-worker configuration, there are N+1 keys; that is, N message queues and one fixed-size control area. The estimator stuff is critically needed in the public header so that you can size your DSM to hold the stuff you intend to store in it; again, see test-shm-mq. I still think shm_mq.c needs to explain more of that. That's where I look for a brief high-level explanation, no? That stuff mostly pertains to shm_toc.c, not shm_mq.c.
I can certainly look at expanding the explanation in the former. I had a very quick look at shm_mq_receive()/send() and shm_mq_receive_bytes()/shm_mq_send_bytes() - while the low-level steps seem to be explained fairly well, the high-level design of the queue doesn't seem to be explained much. I was at first thinking that you were trying to implement a lockless queue (which is quite efficiently possible for 1:1 queues), partly because I didn't see any locking in them until I noticed it's just delegated to helper functions. I did initially implement it as lockless, but I figured I needed a fallback with spinlocks for machines that didn't have atomic 8-byte writes and/or memory barriers, both of which are required for this to work without locks. When I implemented the fallback, I discovered that it wasn't any slower than the lock-free implementation, so I decided to simplify my life (and the code) by not bothering with it. So the basic idea is that you have to take the spinlock to modify the count of bytes read - if you're the reader - or written - if you're the writer. You also need to take the spinlock to read the counter the other process might be modifying, but you can read the counter that only you can modify without the lock. You also don't need the lock to read or write the data in the buffer, because you can only write to the portion of the buffer that the reader isn't currently allowed to read, and you can only read from the portion of the buffer that the writer isn't currently allowed to write. I wonder if we couldn't just generally store a generation number for each PGPROC which is increased every time the slot is getting reused. Then one could simply check whether mq->mq_sender->generation == mq->mq_sender_generation. I think you'd want to bump the generation every time the slot was released rather than when it was reacquired, but it does sound possibly sensible.
However, it wouldn't obviate the separate need to watch a BackgroundWorkerHandle, because what this lets you do is (say) write to a queue after you've registered the reader but before it's actually been started by the postmaster, let alone acquired a PGPROC. I won't object if someone implements such a system for their needs, but it's not on my critical path. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
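The locking discipline Robert describes above - each side owns one counter, takes the spinlock only to touch the counter the other side may be changing, and may freely access the part of the buffer the other side is excluded from - can be sketched as a toy single-process ring buffer. This is a hypothetical illustration (a pthread mutex stands in for the spinlock), not the actual shm_mq.c code:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

/* Toy single-producer/single-consumer queue illustrating the rule above:
 * the reader owns bytes_read, the writer owns bytes_written.  The lock is
 * taken only to read or update the *other* side's counter; your own counter
 * needs no lock, and neither does the buffer region you are entitled to. */
typedef struct
{
	pthread_mutex_t lock;		/* stands in for the spinlock */
	uint64_t	bytes_written;	/* modified only by the writer */
	uint64_t	bytes_read;		/* modified only by the reader */
	char		buf[128];		/* ring buffer */
} toy_mq;

static uint64_t
mq_read_counter(toy_mq *mq, const uint64_t *counter)
{
	uint64_t	v;

	pthread_mutex_lock(&mq->lock);
	v = *counter;
	pthread_mutex_unlock(&mq->lock);
	return v;
}

/* Writer: may only touch buffer space the reader is not allowed to read. */
static int
mq_send_byte(toy_mq *mq, char c)
{
	uint64_t	written = mq->bytes_written;	/* own counter: no lock */
	uint64_t	read = mq_read_counter(mq, &mq->bytes_read);

	if (written - read >= sizeof(mq->buf))
		return 0;				/* queue full */
	mq->buf[written % sizeof(mq->buf)] = c;
	pthread_mutex_lock(&mq->lock);
	mq->bytes_written = written + 1;
	pthread_mutex_unlock(&mq->lock);
	return 1;
}

/* Reader: mirror image of the writer. */
static int
mq_recv_byte(toy_mq *mq, char *c)
{
	uint64_t	read = mq->bytes_read;			/* own counter: no lock */
	uint64_t	written = mq_read_counter(mq, &mq->bytes_written);

	if (read == written)
		return 0;				/* queue empty */
	*c = mq->buf[read % sizeof(mq->buf)];
	pthread_mutex_lock(&mq->lock);
	mq->bytes_read = read + 1;
	pthread_mutex_unlock(&mq->lock);
	return 1;
}
```

The real queue lives in dynamic shared memory and blocks on a latch when full or empty, but the counter-ownership invariant is the same.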
Re: [HACKERS] Index overhead cost reporting
On 7 December 2013 19:41, Tom Lane t...@sss.pgh.pa.us wrote: Thom Brown t...@linux.com writes: I was wondering whether anyone has any insight with regards to measuring and reporting the overhead of maintaining indexes on relations. If an UPDATE is issued to a table with, say, 6 indexes, it would be useful to determine how much time is spent updating each of those indexes. And perhaps such timings would not be confined to indexes, but also other dependants that add overhead, such as triggers, rules, and in future, eager incremental materialised view updates. We already do track the time spent in triggers. Although Kevin might have a different idea, I'd think that matview updates should also be driven by triggers, so that the last item there would be covered. Oh, of course. :) Is that something that could be provided in an EXPLAIN ANALYSE node? Well, it'd not be particularly difficult to add measurements of the time spent in ExecInsertIndexTuples, but I'd have some concerns about that: * Instrumentation overhead. That's already painful on machines with slow gettimeofday, and this has the potential to add a lot more, especially with the more expansive readings of your proposal. * Is it really measuring the right thing? To a much greater degree than for some other things you might try to measure, just counting time spent in ExecInsertIndexTuples is going to understate the true cost of updating an index, because so much of the true cost is paid asynchronously; viz, writing WAL as well as the actual index pages. We already have that issue with measuring the runtime of a ModifyTable node as a whole, but slicing and dicing that time ten ways would make the results even more untrustworthy, IMO. * There are also other costs to think of, such as the time spent by VACUUM on maintaining the index, and the time spent by the planner considering (perhaps vainly) whether it can use the index for each query that reads the table. 
In principle you could instrument VACUUM to track the time it spends updating each index, and log that in the pgstats infrastructure. (I'd even think that might be a good idea, except for the bloat effect on the pgstats files.) I'm not at all sure there's any practical way to measure the distributed planner overhead; it's not paid in discrete chunks large enough to be timed easily. Perhaps it's small enough to ignore, but I'm not sure. Bottom line, I think it's a harder problem than it might seem at first glance. Thanks for taking the time to explain the above. Perhaps I may have misunderstood, or not explained my question with enough detail, but you appear to be including activity that would, in all likelihood, occur after the DML has returned confirmation to the user that it has completed; in particular, VACUUM. What I was thinking of was an execution plan node to communicate the index modifications that are carried out prior to confirmation of the query completing. The bgwriter, WAL writer et al. that spring into action as a result of the index being updated wouldn't, as I see it, be included. So in essence, I'd only be looking for a breakdown of anything that adds to the duration of the DML statement. However, it sounds like even that isn't straightforward from what you've written. Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 2/2] SSL: Support ECDH key excange.
On Thu, 2013-11-07 at 01:59 +0200, Marko Kreen wrote: This sets up ECDH key exchange, when compiling against OpenSSL that supports EC. Then ECDHE-RSA and ECDHE-ECDSA ciphersuites can be used for SSL connections. Latter one means that EC keys are now usable. Committed v2. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql_check_function - rebase for 9.3
On 08/22/2013 02:02 AM, Pavel Stehule wrote: rebased Regards Pavel This patch again needs a rebase; it no longer applies cleanly. plpgsql_estate_setup now takes a shared estate parameter and the call in pl_check needs that. I passed NULL in and things seem to work. I think this is another example where our processes aren't working as we'd like. The last time this patch got a review was in March, when Tom said http://www.postgresql.org/message-id/27661.1364267...@sss.pgh.pa.us Shortly after, Pavel responded with a new patch that changes the output format. http://www.postgresql.org/message-id/CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_jjop21fj...@mail.gmail.com I don't think there has been much progress on this patch in the following 9 months. In his review Tom raises the issue of code duplication and asks: do we want to accept that this is the implementation pathway we're going to settle for, or are we going to hold out for a better one? I'd be the first in line to hold out if I had a clue how to move towards a better implementation, but I have none. This question is still outstanding. Here are my comments on this version of the patch: This version of the patch gives output as a table with the structure:

 functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+-----------+----------+---------+--------+------+-------+----------+-------+---------

I like this much better than the previous format. I think the patch needs more documentation. The extent of the documentation is

+   <title>Checking of embedded SQL</title>
+   <para>
+    The SQL statements inside <application>PL/pgSQL</application> functions are
+    checked by validator for semantic errors. These errors
+    can be found by <function>plpgsql_check_function</function>:

I don't think that this adequately describes the function. It isn't clear to me what we mean by 'embedded SQL' and 'semantic errors'. 
I think this would read better as

 <sect2 id="plpgsql-check-function">
  <title>Checking SQL Inside Functions</title>
  <para>
   The SQL statements inside of <application>PL/pgSQL</application> functions can be
   checked by a validation function for semantic errors. You can check
   <application>PL/pgSQL</application> functions by passing the name of the function to
   <function>plpgsql_check_function</function>.
  </para>
  <para>
   The <function>plpgsql_check_function</function> will check the SQL inside of a
   <application>PL/pgSQL</application> function for certain types of errors and warnings.
  </para>

but that needs to be expanded on. I was expecting something like: create function x() returns void as $$ declare a int4; begin a='f'; end $$ language plpgsql; to return an error but it doesn't, but create temp table x(a int4); create or replace function x() returns void as $$ declare a int4; begin insert into x values('h'); end $$ language plpgsql; does give an error when I pass it to the validator. Is this the intended behavior of the patch? If so we need to explain why the first function doesn't generate an error. I feel we need to document what each of the columns in the output means. What is the difference between statement, query and context? Some random comments on the messages you return:

Line 816:

    if (is_expression && tupdesc->natts != 1)
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
                 errmsg(qw, query->query, tupdesc->natts)));

Should this be "query \"%s\" returned %d columns but only 1 is allowed" or something similar?

Line 837:

    else
        /* XXX: should be a warning? */
        ereport(ERROR,
                (errmsg("cannot to identify real type for record type variable")));

In what situation does this come up? Should it be a warning or error? Do we mean "the real type for record variable"?

Line 1000:

    ereport(ERROR,
+           (errcode(ERRCODE_DATATYPE_MISMATCH),
+            errmsg("function does not return composite type, is not possible to identify composite type")));

Should this be "function does not return a composite type, it is not possible to identify the composite type"? 
Line 1109: appendStringInfo(str, parameter \%s\ is overlapped, + refname); gives warnings like: select message,detail FROM plpgsql_check_function('b(int)'); message | detail -+ parameter a is overlapped | Local variable overlap function parameter. How about instead parameter a is duplicated | Local variable is named the same as the function parameter Line 1278: + checker_error(cstate, +
Re: [HACKERS] ANALYZE sampling is too good
On 12/07/2013 11:46 AM, Robert Haas wrote: Maybe there's some highly-principled statistical approach which could be taken here, and if so that's fine, but I suspect not. So what I think we should do is auto-tune the statistics target based on the table size. If, say, we think that the generally useful range for the statistics target is something like 10 to 400, then let's come up with a formula based on table size that outputs 10 for small tables, 400 for really big tables, and intermediate values for tables in the middle. The only approach which makes sense is to base it on a % of the table. In fact, pretty much every paper which has examined statistics estimation for database tables has determined that any estimate based on a less-than-5% sample is going to be wildly inaccurate. Not that 5% samples are 100% accurate, but at least they fit the 80/20 rule. This is the reason why implementing block-based sampling is critical; using our current "take one row out of every page" method, sampling 5% of the table means scanning the whole thing in most tables. We also need to decouple the number of MCVs we keep from the sample size. Certainly our existing sampling algo seems designed to maximize IO for the sample size. There are other qualitative improvements we could make, which Nathan Boley has spoken on. For example, our stats code has no way to recognize a normal or exponential distribution -- it assumes that all columns are randomly distributed. If we could recognize common distribution patterns, then not only could we have better query estimates, those would require keeping *fewer* stats, since all you need for a normal distribution are the end points and the variance. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
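Robert's auto-tuning idea can be made concrete. The thread proposes no actual formula, so the shape below is purely an assumption for illustration (the function name is invented): interpolate on the logarithm of the row count, hitting 10 for tiny tables and saturating at 400 around a billion rows.

```c
#include <assert.h>

/* Hypothetical auto-tuned statistics target: 10 for small tables, 400 for
 * really big ones, log-interpolated in between.  Purely a sketch of the
 * proposal above, not anything in PostgreSQL. */
static int
autotune_stats_target(double reltuples)
{
	int		l = 0;
	double	t;

	/* integer log10 of the (estimated) row count, capped to avoid overflow */
	while (reltuples >= 10.0 && l < 12)
	{
		reltuples /= 10.0;
		l++;
	}
	t = 10.0 + 390.0 * l / 9.0;	/* 10 below 10 rows, 400 at >= 10^9 rows */
	if (t > 400.0)
		t = 400.0;
	return (int) t;
}
```

The exact endpoints and the choice of a log scale are the debatable parts; the point is only that such a formula is cheap to compute from pg_class.reltuples.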
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth and...@tao11.riddles.org.uk writes: Tom == Tom Lane t...@sss.pgh.pa.us writes: Tom I believe that the spec requires that the direct arguments of Tom an inverse or hypothetical-set aggregate must not contain any Tom Vars of the current query level. False. After examining this more closely, ISTM that the direct arguments are supposed to be processed as if they weren't inside an aggregate call at all. That being the case, isn't it flat out wrong for check_agg_arguments() to be examining the agg_ordset list? It should ignore those expressions whilst determining the aggregate's semantic level. As an example, an upper-level Var in those expressions isn't grounds for deciding that the aggregate isn't of the current query level. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index overhead cost reporting
Thom Brown t...@linux.com writes: Perhaps I may have misunderstood, or not explained my question with enough detail, but you appear to be including activity that would, in all likelihood, occur after the DML has returned confirmation to the user that it has completed; in particular, VACUUM. What I was thinking of was an execution plan node to communicate the index modifications that are carried out prior to confirmation of the query completing. The bgwriter, WAL writer et al. that spring into action as a result of the index being updated wouldn't, as I see it, be included. So in essence, I'd only be looking for a breakdown of anything that adds to the duration of the DML statement. However, it sounds like even that isn't straightforward from what you've written. I think that would be reasonably straightforward, though perhaps too expensive depending on the speed of clock reading. My larger point was that I don't think that that alone is a fair measure of the cost of maintaining an index, which is what you had claimed to be interested in. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index overhead cost reporting
On 7 December 2013 20:44, Tom Lane t...@sss.pgh.pa.us wrote: Thom Brown t...@linux.com writes: So in essence, I'd only be looking for a breakdown of anything that adds to the duration of the DML statement. However, it sounds like even that isn't straightforward from what you've written. I think that would be reasonably straightforward, though perhaps too expensive depending on the speed of clock reading. My larger point was that I don't think that that alone is a fair measure of the cost of maintaining an index, which is what you had claimed to be interested in. Point taken. Yes, I don't believe I was that clear. This was a how much of my DML statement time was spent updating indexes rather than a give me stats on index maintenance times for these indexes. It's a shame that it comes at a non-trivial price. Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE sampling is too good
On Sat, Dec 7, 2013 at 8:25 PM, Josh Berkus j...@agliodbs.com wrote: The only approach which makes sense is to base it on a % of the table. In fact, pretty much every paper which has examined statistics estimation for database tables has determined that any estimate based on a less-than-5% sample is going to be wildly inaccurate. Not that 5% samples are 100% accurate, but at least they fit the 80/20 rule. This is nonsense. The math behind the deductions the planner makes is well understood and we know how to estimate the precision based on the sample size. There are some questions that really need a proportional sample such as n_distinct (and as Robert points out the MCV list) but the main motivation of the sample size is the histograms, and those are used to answer questions which very clearly do not need a proportional sample. The statistics are very clear there. Using a proportional sample for the histograms would just be silly. It would be substituting a gut feeling for real statistics. The problem with using a proportional sample for things like n_distinct or the MCV list is that you're talking about sorting or hashing an unboundedly large set of values and storing an unboundedly large array in the statistics table. I don't think that would be tenable without dramatically changing the way we process and store that data to be more scalable. Using a lossy counting algorithm and something more scalable than toasted arrays would be prerequisites I think. And as Robert mentions, even if we solved those problems it wouldn't help n_distinct. You really need to read all the values to deal with n_distinct. This is the reason why implementing block-based sampling is critical; using our current take one row out of every page method, sampling 5% of the table means scanning the whole thing in most tables. We also need to decouple the number of MCVs we keep from the sample size. Certainly our existing sampling algo seems designed to maximize IO for the sample size. 
It would be helpful if you could specify what you mean by block-based sampling. The reason these previous pleas didn't go anywhere is not because we can't do math. It's because of the lack of specifics here. What we do *today* is called block-based sampling by the literature. What I'm saying is we need to change the block-based sampling that we do to extract more rows per block. We currently extract the correct number of rows to get a strong guarantee of uniformity. If we could get a weaker guarantee of being close enough to uniform samples for the deductions we want to make and get many more rows per block, then we could read a lot fewer blocks. Or to put it another way, people could increase the statistics target dramatically and still be reading the same number of blocks as today. In an ideal world perhaps we could have people reading 1% of the blocks they read today and get statistics targets 10x better than today. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
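Greg's statistical point - that histogram precision is governed by the absolute sample size, not by a percentage of the table - falls straight out of the textbook formula for the variance of a sampled proportion, which contains no table-size term at all:

```c
#include <assert.h>

/* Variance of an estimated histogram-bin fraction p from a uniform random
 * sample of r rows: p*(1-p)/r.  (For sampling without replacement there is
 * a finite-population correction, which only makes things better.)  Note
 * there is no table-size n here: that is why a fixed-size sample, rather
 * than a fixed percentage, suffices for histogram bounds. */
static double
bin_variance(double p, double r)
{
	return p * (1.0 - p) / r;
}
```

n_distinct is the exception, as the thread says: it genuinely depends on values the sample never sees, so no fixed-size sample gives comparable guarantees there.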
Re: [HACKERS] WITHIN GROUP patch
Tom == Tom Lane t...@sss.pgh.pa.us writes: Tom After examining this more closely, ISTM that the direct Tom arguments are supposed to be processed as if they weren't inside Tom an aggregate call at all. That being the case, isn't it flat Tom out wrong for check_agg_arguments() to be examining the Tom agg_ordset list? It should ignore those expressions whilst Tom determining the aggregate's semantic level. As an example, an Tom upper-level Var in those expressions isn't grounds for deciding Tom that the aggregate isn't of the current query level. Hmm... yes, you're probably right; but we'd still have to check somewhere for improper nesting, no? since not even the direct args are allowed to contain nested aggregate calls. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql_check_function - rebase for 9.3
Hello thank you for review 2013/12/7 Steve Singer st...@ssinger.info On 08/22/2013 02:02 AM, Pavel Stehule wrote: rebased Regards Pavel This patch again needs a rebase, it no longer applies cleanly. plpgsql_estate_setup now takes a shared estate parameter and the call in pl_check needs that. I passed NULL in and things seem to work. I think this is another example where are processes aren't working as we'd like. The last time this patch got a review was in March when Tom said http://www.postgresql.org/message-id/27661.1364267...@sss.pgh.pa.us Shortly after Pavel responded with a new patch that changes the output format. http://www.postgresql.org/message-id/ CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_jjop21fj...@mail.gmail.com I don't much has progress in the following 9 months on this patch. In Tom's review the issue of code duplication and asks: do we want to accept that this is the implementation pathway we're going to settle for, or are we going to hold out for a better one? I'd be the first in line to hold out if I had a clue how to move towards a better implementation, but I have none. yes, I am waiting still to a) have some better idea, or b) have significantly more free time for larger plpgsql refactoring. Nothing of it happens :( This question is still outstanding. Here are my comments on this version of the patch: This version of the patch gives output as a table with the structure: functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context ++---+--+-+- ---+--+---+--+---+- I like this much better than the previous format. I think the patch needs more documentation. The extent of the documentation is titleChecking of embedded SQL/title +para + The SQL statements inside applicationPL/pgSQL/ functions are + checked by validator for semantic errors. These errors + can be found by functionplpgsql_check_function/function: I a don't think that this adequately describes the function. 
It isn't clear to me what we mean by 'embedded' SQL. and 'semantic errors'. I think this would read better as sect2 id=plpgsql-check-function titleChecking SQL Inside Functions/title para The SQL Statements inside of applicationPL/pgsql/ functions can be checked by a validation function for semantic errors. You can check applicationPL/pgsl/application functions by passing the name of the function to functionplpgsql_check_function/function. /para para The functionplpgsql_check_function/function will check the SQL inside of a applicationPL/pgsql/application function for certain types of errors and warnings. /para but that needs to be expanded on. I was expecting something like: create function x() returns void as $$ declare a int4; begin a='f'; end $$ language plpgsql; to return an error but it doesn't This is expected. PL/pgSQL use a late casting - so a := '10'; is acceptable. And in this moment plpgsql check doesn't evaluate constant and doesn't try to cast to target type. But this check can be implemented sometime. In this moment, we have no simple way how to identify constant expression in plpgsql. So any constants are expressions from plpgsql perspective. but create temp table x(a int4); create or replace function x() returns void as $$ declare a int4; begin insert into x values('h'); end $$ language plpgsql; it is expected too. SQL doesn't use late casting - casting is controlled by available casts and constants are evaluated early - due possible optimization. does give an error when I pass it to the validator. Is this the intended behavior of the patch? If so we need to explain why the first function doesn't generate an error. I feel we need to document what each of the columns in the output means. What is the difference between statement, query and context? 
Some random comments on the messages you return: Line 816: if (is_expression tupdesc-natts != 1) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), errmsg(qw, query-query, tupdesc-natts))); Should this be query \%s\ returned %d columns but only 1 is allowed or something similar? Line 837 else /* XXX: should be a warning? */ ereport(ERROR, (errmsg(cannot to identify real type for record type variable))); In what situation does this come up? Should it be a warning or error? Do we mean the real type for record variable ? a most difficult part of plpgsql check function is about transformation dynamic types to know row
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth and...@tao11.riddles.org.uk writes: Tom == Tom Lane t...@sss.pgh.pa.us writes: Tom After examining this more closely, ISTM that the direct Tom arguments are supposed to be processed as if they weren't inside Tom an aggregate call at all. That being the case, isn't it flat Tom out wrong for check_agg_arguments() to be examining the Tom agg_ordset list? It should ignore those expressions whilst Tom determining the aggregate's semantic level. As an example, an Tom upper-level Var in those expressions isn't grounds for deciding Tom that the aggregate isn't of the current query level. Hmm... yes, you're probably right; but we'd still have to check somewhere for improper nesting, no? since not even the direct args are allowed to contain nested aggregate calls. Yeah, I had come to that same conclusion while making the changes; actually, check_agg_arguments needs to look for aggs but not vars there. In principle we could probably support aggs there if we really wanted to; but it would result in evaluation-order dependencies among the aggs of a single query level, which doesn't seem like something we want to take on for a feature that's not required by spec. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On Sat, 2013-12-07 at 12:27 -0500, Stephen Frost wrote: * Jeff Davis (pg...@j-davis.com) wrote: The behavior of an extension should not depend on how it was installed. The kind of extension being described by Stephen will: * Not be updatable by doing ALTER EXTENSION foo UPDATE TO '2.0' ... [ reason ] ... * Dump out objects that wouldn't be dumped if they had installed the extension using the filesystem ... [ reason ] ... I understand there are reasons, but I'm having a hard time getting past the idea that I have extension foo v1.2 now needs to be qualified with installed using SQL or installed using the filesystem to know what you actually have and how it will behave. Stepping back, maybe we need to do some more research on existing SQL-only extensions. We might be over-thinking this. How many extensions are really just a collection of functions on existing types? If you define a new data type, almost all of the functions seem to revolve around C code -- not just to define the basic data type, but also the GiST support routines, which then mean you're implementing operators in C too, etc. Perhaps we should first focus on making SQL-only extensions more useful? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE sampling is too good
On 08/12/13 09:25, Josh Berkus wrote: On 12/07/2013 11:46 AM, Robert Haas wrote: Maybe there's some highly-principled statistical approach which could be taken here, and if so that's fine, but I suspect not. So what I think we should do is auto-tune the statistics target based on the table size. If, say, we think that the generally useful range for the statistics target is something like 10 to 400, then let's come up with a formula based on table size that outputs 10 for small tables, 400 for really big tables, and intermediate values for tables in the middle. The only approach which makes sense is to base it on a % of the table. In fact, pretty much every paper which has examined statistics estimation for database tables has determined that any estimate based on a less-than-5% sample is going to be wildly inaccurate. Not that 5% samples are 100% accurate, but at least they fit the 80/20 rule. This is the reason why implementing block-based sampling is critical; using our current take one row out of every page method, sampling 5% of the table means scanning the whole thing in most tables. We also need to decouple the number of MCVs we keep from the sample size. Certainly our existing sampling algo seems designed to maximize IO for the sample size. There's other qualitative improvements we could make, which Nathan Boley has spoken on. For example, our stats code has no way to recognize a normal or exponential distrbution -- it assumes that all columns are randomly distributed. If we could recoginze common distribution patterns, then not only could we have better query estimates, those would require keeping *fewer* stats, since all you need for a normal distribution are the end points and the variance. From src/backend/commands/analyze.c * As of May 2004 we use a new two-stage method: Stage one selects up * to targrows random blocks (or all blocks, if there aren't so many). 
* Stage two scans these blocks and uses the Vitter algorithm to create * a random sample of targrows rows (or less, if there are less in the * sample of blocks). The two stages are executed simultaneously: each * block is processed as soon as stage one returns its number and while * the rows are read stage two controls which ones are to be inserted * into the sample. I don't think we always read every block (been a while since I looked at this code, so I'll recheck). From what I understand this stuff is based on reasonable research (Vitter algorithm). Not to say it couldn't/shouldn't be looked at again to improve it - but it is not just dreamed up with no valid research! Cheers Mark -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
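The "stage two" quoted above is reservoir sampling. The classic "Algorithm R" form below shows the invariant (analyze.c uses Vitter's faster skip-based refinement of it, but the guarantee is the same): after t rows have been seen, each of them is in the k-row sample with probability k/t. This is an illustrative sketch, not the analyze.c code:

```c
#include <assert.h>
#include <stdlib.h>

/* Classic reservoir sampling ("Algorithm R").  Keeps a uniform random
 * sample of k rows from a stream of nrows rows in one pass: the first k
 * rows fill the reservoir, and row t (0-based) thereafter replaces a
 * random reservoir slot with probability k/(t+1). */
static void
reservoir_sample(const int *rows, int nrows, int *sample, int k)
{
	int		t;

	for (t = 0; t < nrows; t++)
	{
		if (t < k)
			sample[t] = rows[t];	/* fill the reservoir */
		else
		{
			int		j = rand() % (t + 1);	/* uniform in [0, t] */

			if (j < k)
				sample[j] = rows[t];
		}
	}
}
```

The one-pass property is what lets analyze.c run both stages simultaneously, deciding a row's fate as soon as it is read.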
Re: [HACKERS] WITHIN GROUP patch
Tom == Tom Lane t...@sss.pgh.pa.us writes: Hmm... yes, you're probably right; but we'd still have to check somewhere for improper nesting, no? since not even the direct args are allowed to contain nested aggregate calls. Tom Yeah, I had come to that same conclusion while making the Tom changes; actually, check_agg_arguments needs to look for aggs Tom but not vars there. There's also the question of ungrouped vars, maybe. Consider these two queries: select array(select a+sum(x) from (values (0.3),(0.7)) v(a) group by a) from generate_series(1,5) g(x); select array(select percentile_disc(a) within group (order by x) from (values (0.3),(0.7)) v(a) group by a) from generate_series(1,5) g(x); In both cases the aggregation query is the outer one; but while the first can return a value, I think the second one has to fail (at least I can't see any reasonable way of executing it). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
On Sun, 2013-12-08 at 02:08 +0900, Fujii Masao wrote: Attached revision displays signed 64-bit integers instead. Thanks! Looks good to me. Committed! 32-bit buildfarm members are having problems with this patch. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
On Sat, Dec 7, 2013 at 3:50 PM, Peter Eisentraut pete...@gmx.net wrote: 32-bit buildfarm members are having problems with this patch. This should fix that problem. Thanks. -- Peter Geoghegan

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
new file mode 100644
index 4e262b4..9f3e376
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
*** pg_stat_statements(PG_FUNCTION_ARGS)
*** 1160,1165 ****
--- 1160,1166 ----
  		bool		nulls[PG_STAT_STATEMENTS_COLS];
  		int			i = 0;
  		Counters	tmp;
+ 		int64		queryid = entry->key.queryid;
  
  		memset(values, 0, sizeof(values));
  		memset(nulls, 0, sizeof(nulls));
*** pg_stat_statements(PG_FUNCTION_ARGS)
*** 1172,1178 ****
  			char	   *qstr;
  
  			if (detected_version >= PGSS_V1_2)
! 				values[i++] = Int64GetDatumFast((int64) entry->key.queryid);
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,
--- 1173,1179 ----
  			char	   *qstr;
  
  			if (detected_version >= PGSS_V1_2)
! 				values[i++] = Int64GetDatumFast(queryid);
  
  			qstr = (char *)
  				pg_do_encoding_conversion((unsigned char *) entry->query,

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
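Why the fix copies the value into a local first: on builds without pass-by-value int64 (typically 32-bit), Int64GetDatumFast(X) is defined roughly as PointerGetDatum(&(X)), so X must be an addressable lvalue - and a cast expression like (int64) entry->key.queryid is not one, which is what broke the 32-bit buildfarm members. A simplified stand-in for the macro (toy names, not the postgres.h definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the 32-bit definition of Int64GetDatumFast: it takes the
 * address of its argument, so the argument must be an lvalue. */
#define TOY_INT64_GET_DATUM_FAST(x) ((void *) &(x))

typedef struct
{
	uint64_t	queryid;
} ToyKey;

static int64_t
fetch_queryid(const ToyKey *key)
{
	int64_t		queryid = (int64_t) key->queryid;	/* addressable copy */
	void	   *datum = TOY_INT64_GET_DATUM_FAST(queryid);

	/* TOY_INT64_GET_DATUM_FAST((int64_t) key->queryid) would not compile:
	 * you cannot take the address of a cast expression. */
	return *(int64_t *) datum;
}
```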
Re: [HACKERS] Extension Templates S03E11
* Jeff Davis (pg...@j-davis.com) wrote: I understand there are reasons, but I'm having a hard time getting past the idea that I have extension foo v1.2 now needs to be qualified with installed using SQL or installed using the filesystem to know what you actually have and how it will behave. I can certainly understand that. Stepping back, maybe we need to do some more research on existing SQL-only extensions. We might be over-thinking this. How many extensions are really just a collection of functions on existing types? If you define a new data type, almost all of the functions seem to revolve around C code -- not just to define the basic data type, but also the GiST support routines, which then mean you're implementing operators in C too, etc. Fair point- I'll try and find time to review the 100+ extensions on PGXN and classify them (unless someone else would like to or happens to already know..?). That said, there's certainly cases where people find the existing extension system stinks for their SQL-only code and therefore don't use it- which is exactly the case I'm in @work. We have a ton of pl/pgsql code (along with schema definitions and the like), our own build system which builds both 'full/new' databases based on a certain version *and* will verify that newly built == current version plus a hand-written update script (of course, we have to write that update script) with versioned releases, etc. Point simply being that, were extensions more generally useful, we might see more of them and we need to consider more than just what's in PGXN, and therefore already built as an extension, today. Perhaps we should first focus on making SQL-only extensions more useful? I'm not following what you're suggesting here..? In general, I agree; do you have specific ideas about how to do that? Ones which don't involve a superuser or modifying the filesystem under PG, and which works with replication and backups..? Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] ANALYZE sampling is too good
On 08/12/13 12:27, Mark Kirkwood wrote: On 08/12/13 09:25, Josh Berkus wrote: On 12/07/2013 11:46 AM, Robert Haas wrote: Maybe there's some highly-principled statistical approach which could be taken here, and if so that's fine, but I suspect not. So what I think we should do is auto-tune the statistics target based on the table size. If, say, we think that the generally useful range for the statistics target is something like 10 to 400, then let's come up with a formula based on table size that outputs 10 for small tables, 400 for really big tables, and intermediate values for tables in the middle.

The only approach which makes sense is to base it on a % of the table. In fact, pretty much every paper which has examined statistics estimation for database tables has determined that any estimate based on a less-than-5% sample is going to be wildly inaccurate. Not that 5% samples are 100% accurate, but at least they fit the 80/20 rule. This is the reason why implementing block-based sampling is critical; using our current "take one row out of every page" method, sampling 5% of the table means scanning the whole thing in most tables. We also need to decouple the number of MCVs we keep from the sample size. Certainly our existing sampling algo seems designed to maximize IO for the sample size. There are other qualitative improvements we could make, which Nathan Boley has spoken on. For example, our stats code has no way to recognize a normal or exponential distribution -- it assumes that all columns are randomly distributed. If we could recognize common distribution patterns, then not only could we have better query estimates, those would require keeping *fewer* stats, since all you need for a normal distribution are the end points and the variance.

From src/backend/commands/analyze.c:

 * As of May 2004 we use a new two-stage method: Stage one selects up
 * to targrows random blocks (or all blocks, if there aren't so many).
 * Stage two scans these blocks and uses the Vitter algorithm to create
 * a random sample of targrows rows (or less, if there are less in the
 * sample of blocks). The two stages are executed simultaneously: each
 * block is processed as soon as stage one returns its number and while
 * the rows are read stage two controls which ones are to be inserted
 * into the sample.

I don't think we always read every block (been a while since I looked at this code, so I'll recheck). From what I understand this stuff is based on reasonable research (the Vitter algorithm). Not to say it couldn't/shouldn't be looked at again to improve it - but it is not just dreamed up with no valid research!

Since I volunteered to recheck :-) from analyze.c again:

/*
 * The following choice of minrows is based on the paper
 * "Random sampling for histogram construction: how much is enough?"
 * by Surajit Chaudhuri, Rajeev Motwani and Vivek Narasayya, in
 * Proceedings of ACM SIGMOD International Conference on Management
 * of Data, 1998, Pages 436-447. Their Corollary 1 to Theorem 5
 * says that for table size n, histogram size k, maximum relative
 * error in bin size f, and error probability gamma, the minimum
 * random sample size is
 *      r = 4 * k * ln(2*n/gamma) / f^2
 * Taking f = 0.5, gamma = 0.01, n = 10^6 rows, we obtain
 *      r = 305.82 * k
 * Note that because of the log function, the dependence on n is
 * quite weak; even at n = 10^12, a 300*k sample gives <= 0.66
 * bin size error with probability 0.99. So there's no real need to
 * scale for n, which is a good thing because we don't necessarily
 * know it at this point.
 */
stats->minrows = 300 * attr->attstattarget;

So for typical settings (default_statistics_target = 100), we try to get 30000 rows. This means we will sample about 30000 blocks.
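The arithmetic in that comment is easy to verify; a quick sketch (Python, just illustration, not PostgreSQL code) evaluating the Chaudhuri et al. bound with the stated parameters:

```python
import math

def chaudhuri_min_sample(k, n=10**6, f=0.5, gamma=0.01):
    # r = 4 * k * ln(2*n/gamma) / f^2, Corollary 1 to Theorem 5 of
    # Chaudhuri, Motwani and Narasayya (SIGMOD 1998)
    return 4 * k * math.log(2 * n / gamma) / f ** 2

k = 100  # default_statistics_target
print(chaudhuri_min_sample(k) / k)  # ~305.82, which analyze.c rounds down to 300
print(300 * k)                      # minrows = 30000
```

Running it confirms the 305.82 coefficient quoted in the comment, and hence the 30000-row sample at the default statistics target.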
Indeed quickly checking with a scale 100 pgbench db and a simple patch to make the block sampler say how many blocks it reads (note pgbench_accounts has 163935 blocks):

bench=# ANALYZE pgbench_branches;
NOTICE:  acquire sample will need 30000 blocks
NOTICE:  sampled 1 blocks
ANALYZE
Time: 16.935 ms
bench=# ANALYZE pgbench_accounts;
NOTICE:  acquire sample will need 30000 blocks
NOTICE:  sampled 30000 blocks
ANALYZE
Time: 10059.446 ms
bench=# \q

Regards Mark -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
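For reference, stage two of the two-stage method quoted above is reservoir sampling. A minimal Python sketch of the idea (this is Vitter's simple Algorithm R; the actual analyze.c code uses the more refined Algorithm Z, which skips ahead rather than visiting every row):

```python
import random

def reservoir_sample(rows, targrows, rng=random):
    # Keep a uniform random sample of up to targrows items from a
    # stream whose total length is not known in advance.
    sample = []
    for i, row in enumerate(rows):
        if i < targrows:
            sample.append(row)        # fill the reservoir first
        else:
            j = rng.randrange(i + 1)  # keep row with probability targrows/(i+1)
            if j < targrows:
                sample[j] = row
    return sample
```

Every item seen so far has an equal targrows/(i+1) chance of being in the reservoir at each step, which is what gives the uniformity guarantee the thread is discussing.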
Re: [HACKERS] Add %z support to elog/ereport?
On 2013-12-06 09:54:59 -0500, Peter Eisentraut wrote: On 11/11/13, 12:01 PM, Tom Lane wrote: I do recall Peter saying that the infrastructure knows how to verify conversion specs in translated strings, so it would have to be aware of 'z' flags for this to work. It just checks that the same conversion placeholders appear in the translation. It doesn't interpret the actual placeholders. I don't think this will be a problem. Ok, so here's a patch. Patch 01 introduces infrastructure, and elog.c implementation. Patch 02 converts some elog/ereport() callers to using the z modifier, some were wrong at least on 64 bit windows. In patch 01, I've modified configure to not define [U]INT64_FORMAT directly, but rather just define INT64_LENGTH_MODIFIER as appropriate. The formats are now defined in c.h. INT64_LENGTH_MODIFIER is defined without quotes - I am not sure whether that was the right choice, it requires using CppAsString2() in some places. Maybe I should just have defined it with quotes instead. Happy to rejigger it if somebody thinks the other way would be better. Note that I have decided to only support the z modifier properly if no further modifiers (precision, width) are present. That seems sufficient for now, given the usecases, and more complete support seems to be potentially expensive without much use. I wonder if we should also introduce SIZE_T_FORMAT and SSIZE_T_FORMAT, there's some places that have #ifdef _WIN64 guards and similar that look like they could be simplified. Btw, walsender.c/walreceiver.c upcast an int into an unsigned long for printing? That's a bit strange. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From b31f7c5fc2b2d4451c650a54bb02e656b8fd3bbf Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Sat, 7 Dec 2013 19:23:15 +0100 Subject: [PATCH 1/2] Add support for printing Size arguments to elog()/ereport() using %zu. 
Similar to the way %m is special-case supported in elog.c routines, add support for the z length modifier by replacing it by the platform's modifier for the pointer size. To do that, refactor the configure checks that define [U]INT64_FORMAT, to instead define the used length modifier as INT64_LENGTH_MODIFIER. Currently we don't support flag, width, precision modifiers in addition to z, but that's fine for the current callsites. --- config/c-library.m4| 36 ++--- configure | 51 +- configure.in | 28 +-- src/backend/utils/error/elog.c | 16 + src/include/c.h| 10 ++--- src/include/pg_config.h.in | 7 ++ src/include/pg_config.h.win32 | 8 ++- 7 files changed, 77 insertions(+), 79 deletions(-) diff --git a/config/c-library.m4 b/config/c-library.m4 index 1e3997b..b836c2a 100644 --- a/config/c-library.m4 +++ b/config/c-library.m4 @@ -221,22 +221,22 @@ HAVE_POSIX_SIGNALS=$pgac_cv_func_posix_signals AC_SUBST(HAVE_POSIX_SIGNALS)])# PGAC_FUNC_POSIX_SIGNALS -# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT +# PGAC_FUNC_SNPRINTF_LONG_LONG_LENGTH_MODIFIER # --- -# Determine which format snprintf uses for long long int. We handle -# %lld, %qd, %I64d. The result is in shell variable -# LONG_LONG_INT_FORMAT. +# Determine which format snprintf uses for long long integers. We +# handle %ll, %q, %I64. The detected length modifer is stored in +# the shell variable LONG_LONG_LENGTH_MODIFIER. # -# MinGW uses '%I64d', though gcc throws an warning with -Wall, -# while '%lld' doesn't generate a warning, but doesn't work. +# MinGW uses '%I64', though gcc throws an warning with -Wall, +# while '%ll' doesn't generate a warning, but doesn't work. 
# -AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT], -[AC_MSG_CHECKING([snprintf format for long long int]) -AC_CACHE_VAL(pgac_cv_snprintf_long_long_int_format, -[for pgac_format in '%lld' '%qd' '%I64d'; do +AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_LENGTH_MODIFIER], +[AC_MSG_CHECKING([snprintf length modifier for long long]) +AC_CACHE_VAL(pgac_cv_snprintf_long_long_length_modifier, +[for pgac_format in 'll' 'q' 'I64'; do AC_TRY_RUN([#include stdio.h typedef long long int ac_int64; -#define INT64_FORMAT $pgac_format +#define INT64_FORMAT %${pgac_format}d ac_int64 a = 2001; ac_int64 b = 4005; @@ -258,19 +258,19 @@ int does_int64_snprintf_work() main() { exit(! does_int64_snprintf_work()); }], -[pgac_cv_snprintf_long_long_int_format=$pgac_format; break], +[pgac_cv_snprintf_long_long_length_modifier=$pgac_format; break], [], -[pgac_cv_snprintf_long_long_int_format=cross; break]) +[pgac_cv_snprintf_long_long_length_modifier=cross; break]) done])dnl AC_CACHE_VAL -LONG_LONG_INT_FORMAT='' +LONG_LONG_LENGTH_MODIFIER='' -case $pgac_cv_snprintf_long_long_int_format in
Re: [HACKERS] dblink performance regression
On 12/05/2013 07:05 PM, Joe Conway wrote: On 12/05/2013 06:53 PM, Tom Lane wrote: I seem to remember that at some point we realized that the encoding ID assignments are part of libpq's ABI and so can't practically be changed ever, so the above may be moot. Even so, I think it's a bad idea to be depending on pg_encoding_to_char() here, given the ambiguity in what it references. It would be unsurprising to get build-time or run-time failures on pickier platforms, as a consequence of that ambiguity. So I'd still recommend comparing integer IDs as above, rather than this. Great feedback as always -- thanks! Will make that change. Committed to all active branches. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support
Re: [HACKERS] dblink performance regression
On Sat, Dec 7, 2013 at 11:07 PM, Joe Conway m...@joeconway.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/05/2013 07:05 PM, Joe Conway wrote: On 12/05/2013 06:53 PM, Tom Lane wrote: I seem to remember that at some point we realized that the encoding ID assignments are part of libpq's ABI and so can't practically be changed ever, so the above may be moot. Even so, I think it's a bad idea to be depending on pg_encoding_to_char() here, given the ambiguity in what it references. It would be unsurprising to get build-time or run-time failures on pickier platforms, as a consequence of that ambiguity. So I'd still recommend comparing integer IDs as above, rather than this. Great feedback as always -- thanks! Will make that change. Committed to all active branches. IMHO is more elegant create a procedure to encapsulate the code to avoid redundancy. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] dblink performance regression
On Sun, Dec 8, 2013 at 10:16 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Sat, Dec 7, 2013 at 11:07 PM, Joe Conway m...@joeconway.com wrote: On 12/05/2013 07:05 PM, Joe Conway wrote: On 12/05/2013 06:53 PM, Tom Lane wrote: I seem to remember that at some point we realized that the encoding ID assignments are part of libpq's ABI and so can't practically be changed ever, so the above may be moot. Even so, I think it's a bad idea to be depending on pg_encoding_to_char() here, given the ambiguity in what it references. It would be unsurprising to get build-time or run-time failures on pickier platforms, as a consequence of that ambiguity. So I'd still recommend comparing integer IDs as above, rather than this. Great feedback as always -- thanks! Will make that change. Committed to all active branches. IMHO is more elegant create a procedure to encapsulate the code to avoid redundancy. Yep, perhaps something like PQsetClientEncodingIfDifferent or similar would make sense. -- Michael
Re: [HACKERS] dblink performance regression
All, I tested out Joe's original patch, and it does eliminate the 8% performance regression. Will try the new one. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] dblink performance regression
On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier michael.paqu...@gmail.com wrote: IMHO is more elegant create a procedure to encapsulate the code to avoid redundancy. Yep, perhaps something like PQsetClientEncodingIfDifferent or similar would make sense. Well I think at this first moment we can just create a procedure inside the dblink contrib and not touch in libpq. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] dblink performance regression
On 12/07/2013 05:41 PM, Fabrízio de Royes Mello wrote: On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier michael.paqu...@gmail.com wrote: IMHO is more elegant create a procedure to encapsulate the code to avoid redundancy. Yep, perhaps something like PQsetClientEncodingIfDifferent or similar would make sense. Well I think at this first moment we can just create a procedure inside the dblink contrib and not touch in libpq. Maybe a libpq function could be done for 9.4, but not for back branches. I don't think it makes sense to create a new function in dblink either -- we're only talking about two lines of added redundancy which is less lines of code than a new function would add. But if we create PQsetClientEncodingIfDifferent() (or whatever) we can remove those extra lines in 9.4 ;-) Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support
Re: [HACKERS] dblink performance regression
On Sat, Dec 7, 2013 at 11:41 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier michael.paqu...@gmail.com wrote: IMHO is more elegant create a procedure to encapsulate the code to avoid redundancy. Yep, perhaps something like PQsetClientEncodingIfDifferent or similar would make sense. Well I think at this first moment we can just create a procedure inside the dblink contrib and not touch in libpq. Something like the attached. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c index a91a547..1feee22 100644 --- a/contrib/dblink/dblink.c +++ b/contrib/dblink/dblink.c @@ -112,6 +112,7 @@ static Relation get_rel_from_relname(text *relname_text, LOCKMODE lockmode, AclM static char *generate_relation_name(Relation rel); static void dblink_connstr_check(const char *connstr); static void dblink_security_check(PGconn *conn, remoteConn *rconn); +static void dblink_set_client_encoding(PGconn *conn); static void dblink_res_error(const char *conname, PGresult *res, const char *dblink_context_msg, bool fail); static char *get_connect_string(const char *servername); static char *escape_param_str(const char *from); @@ -209,8 +210,7 @@ typedef struct remoteConnHashEnt errdetail_internal(%s, msg))); \ } \ dblink_security_check(conn, rconn); \ -if (PQclientEncoding(conn) != GetDatabaseEncoding()) \ - PQsetClientEncoding(conn, GetDatabaseEncodingName()); \ +dblink_set_client_encoding(conn); \ freeconn = true; \ } \ } while (0) @@ -290,8 +290,7 @@ dblink_connect(PG_FUNCTION_ARGS) dblink_security_check(conn, rconn); /* attempt to set client encoding to match server encoding, if needed */ - if (PQclientEncoding(conn) != GetDatabaseEncoding()) - PQsetClientEncoding(conn, 
GetDatabaseEncodingName()); + dblink_set_client_encoding(conn); if (connname) { @@ -2622,6 +2621,13 @@ dblink_security_check(PGconn *conn, remoteConn *rconn) } } +static void +dblink_set_client_encoding(PGconn *conn) +{ + if (PQclientEncoding(conn) != GetDatabaseEncoding()) + PQsetClientEncoding(conn, GetDatabaseEncodingName()); +} + /* * For non-superusers, insist that the connstr specify a password. This * prevents a password from being picked up from .pgpass, a service file, -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dblink performance regression
On 2013/12/08, at 10:50, Joe Conway m...@joeconway.com wrote: On 12/07/2013 05:41 PM, Fabrízio de Royes Mello wrote: On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier michael.paqu...@gmail.com wrote: IMHO is more elegant create a procedure to encapsulate the code to avoid redundancy. Yep, perhaps something like PQsetClientEncodingIfDifferent or similar would make sense. Well I think at this first moment we can just create a procedure inside the dblink contrib and not touch in libpq. Maybe a libpq function could be done for 9.4, but not for back branches. Agreed, as this would change the spec of libpq between minor releases. Sorry for not being clear myself. -- we're only talking about two lines of added redundancy which is less lines of code than a new function would add. But if we create PQsetClientEncodingIfDifferent() (or whatever) we can remove those extra lines in 9.4 ;-) -- Michael (Sent from my mobile phone)
Re: [HACKERS] ANALYZE sampling is too good
On 08/12/13 10:27, Greg Stark wrote: On Sat, Dec 7, 2013 at 8:25 PM, Josh Berkus j...@agliodbs.com wrote: The only approach which makes sense is to base it on a % of the table. In fact, pretty much every paper which has examined statistics estimation for database tables has determined that any estimate based on a less-than-5% sample is going to be wildly inaccurate. Not that 5% samples are 100% accurate, but at least they fit the 80/20 rule. This is nonsense. The math behind the deductions the planner makes is well understood and we know how to estimate the precision based on the sample size. There are some questions that really need a proportional sample such as n_distinct (and as Robert points out the MCV list) but the main motivation of the sample size is the histograms and those are used to answer questions which very clearly do not need a proportional sample. The statistics is very clear there. Using a proportional sample for the histograms would just be silly. It would be substituting a gut feeling for real statistics. The problems with using a proportional sample for things like n_distinct or the MCV list is that you're talking about sorting or hashing an unboundedly large set of values and storing an unboundedly large array in the statistics table. I don't think that would be tenable without dramatically changing the way process and store that data to be more scalable. Using a lossy counting algorithm and something more scalable than toasted arrays would be prerequisites I think. And as Robert mentions even if we solved those problems it wouldn't help n_distinct. You really need to read all the values to deal with n_distinct. This is the reason why implementing block-based sampling is critical; using our current take one row out of every page method, sampling 5% of the table means scanning the whole thing in most tables. We also need to decouple the number of MCVs we keep from the sample size. 
Certainly our existing sampling algo seems designed to maximize IO for the sample size. This would be helpful if you could specify what you mean by block-based sampling. The reason these previous pleas didn't go anywhere is not because we can't do math. It's because of the lack of specifics here. What we do *today* is called block-based sampling by the literature. What I'm saying is we need to change the block-based sampling that we do to extract more rows per block. We currently extract the correct number of rows to get a strong guarantee of uniformity. If we could get a weaker guarantee of being close enough to uniform samples for the deductions we want to make and get many more rows per block then we could read a lot fewer blocks. Or to put it another way, people could increase the statistics target dramatically and still be reading the same number of blocks as today. In an ideal world perhaps we could have people reading 1% of the blocks they read today and get statistics targets 10x better than today.

I suspect that the number of rows to sample should be something of the form:

    n = { N             where N <= a * 100
        { (a * N) / 100 where a * 100 < N < 10^8
        { a * SQRT(N)   where N >= 10^8

n: the number of rows to sample
N: total number of rows
a: configurable coefficient (1 <= a <= 100)

For very large numbers of rows, like 10^8, taking a constant fraction will over sample for most purposes, hence the square root.

      a       N        n sampled    % sampled
      1     10^4          100         1%
    100     10^4         10^4       100%
      1     10^8         10^4         0.01%
    100     10^8         10^6         1%
      1     10^10        10^5         0.001%
    100     10^10        10^7         0.1%

For very large values of N, you can still get a representative sample with a very small fraction of rows sampled. Hmm... Yes, I can see some issues, once I tried out with test values! However, it might inspire a more useful and practical approach! Cheers, Gavin
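Reading the piecewise proposal with cutovers at a * 100 and 10^8 rows (my interpretation of the thresholds, chosen to match the table of test values), the rows sampled can be computed like this:

```python
from math import sqrt

def sample_rows(N, a):
    # Sample everything for tiny tables, a% of rows for mid-sized
    # tables, and a multiple of sqrt(N) for very large ones.
    if N <= a * 100:
        return N
    elif N < 10 ** 8:
        return (a * N) // 100
    else:
        return int(a * sqrt(N))

for a in (1, 100):
    for N in (10 ** 4, 10 ** 8, 10 ** 10):
        n = sample_rows(N, a)
        print(a, N, n, "%g%%" % (100.0 * n / N))
```

Note this reproduces the tabulated values above, including the shrinking sampled fraction (0.001% at N = 10^10, a = 1) that motivates the sqrt branch; the exact boundary conditions are an assumption on my part.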
Re: [HACKERS] [bug fix] pg_ctl always uses the same event source
From: MauMau maumau...@gmail.com I've removed a limitation regarding the event log on Windows with the attached patch. I hesitate to call this a bug fix and would rather regard it as an improvement, but maybe it's a bug fix from the users' perspective. Actually, I received problem reports from some users. I've made a small correction: I removed a redundant default value from pg_ctl.c. Regards MauMau pg_ctl_eventsrc_v2.patch Description: Binary data
Re: [HACKERS] [bug fix] pg_ctl fails with config-only directory
On Sat, Dec 7, 2013 at 6:02 PM, MauMau maumau...@gmail.com wrote: From: Amit Kapila amit.kapil...@gmail.com Today, I again went through all the discussion that happened at that time related to this problem, and I found that later in the discussion something along the lines of your Approach-2 was discussed; please see the link http://www.postgresql.org/message-id/503a879c.6070...@dunslane.net Thank you again. I'm a bit relieved to see similar discussion. Wouldn't the other way to resolve this problem be to reinvoke pg_ctl in non-restricted mode for the case in question? It's effectively the same as not checking admin privileges. And direct invocation of postgres -C/--describe-config still cannot be helped. Yeah, that is the only point of reinvoking pg_ctl in non-restricted mode: it should only be allowed through pg_ctl, not directly with postgres -C. In any case, we now have a link for the discussion and a patch as well for the other approach, so if the consensus is to modify the postgres code such that we don't need to check for admin privileges for the -C/--describe-config option, then the patch proposed by you can be taken; otherwise the other patch can be used. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Extra functionality to createuser
On Fri, Dec 6, 2013 at 10:31 AM, Peter Eisentraut pete...@gmx.net wrote: On Wed, 2013-11-20 at 11:23 -0500, Christopher Browne wrote: I note that similar (though not quite identical) issues apply to the user name. Perhaps the resolution to this is to leave quoting issues to the administrator. That simplifies the problem away. How about only one role name per -g option, but allowing the -g option to be repeated? I think that might simplify the problem and patch, but do you think it is okay to have inconsistency in the usage of options between the CREATE USER statement and this utility? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
[HACKERS] About shared cache invalidation mechanism
I know that all cache invalidation messages are stored in the shmInvalidationBuffer ring buffer and that they should be consumed by all other backends to keep their own caches fresh. Since there may be some stragglers which process SI messages quite slowly, we use a *catchup* interrupt (signal) to accelerate their consumption of shared invalidation messages. Here come my questions:

(1). When the current number of messages in the shmInvalidationBuffer exceeds a threshold, it needs to be cleaned up by using SICleanupQueue. After that, if the number still exceeds MAXNUMMESSAGES/2, the threshold will be recalculated by the following formula:

    threshold = (numMsgs/CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM

(2). For those slow backends, if their *nextMsgNum* value is less than *lowbound*, they will be reset, where *lowbound* is calculated as

    lowbound = maxMsgNum - MAXNUMMESSAGES + minFree

and if their *nextMsgNum* value is less than *minsig*, they will get catchup signals to speed up, where *minsig* is calculated as

    minsig = maxMsgNum - MAXNUMMESSAGES/2

Here, I want to ask why threshold, lowbound and minsig are calculated like that. Do the three formulas reflect performance considerations in their design? I have searched through the old mailing list archives but found nothing about these (the changes first appeared in PG 8.4); any help would be appreciated.
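To make the three formulas concrete, here is a small sketch of how they interact (Python for illustration only; the constant values are assumptions for the example, not necessarily those in sinvaladt.c):

```python
MAXNUMMESSAGES = 4096    # ring buffer capacity (assumed value)
CLEANUP_QUANTUM = 128    # cleanup granularity (assumed value)

def next_cleanup_threshold(num_msgs):
    # Round up to the next multiple of CLEANUP_QUANTUM, so the next
    # cleanup attempt happens only after another quantum of messages
    # has accumulated, rather than on every insertion.
    return (num_msgs // CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM

def reset_lowbound(max_msg_num, min_free):
    # Backends whose nextMsgNum is behind this point are reset;
    # doing so guarantees at least min_free slots become free.
    return max_msg_num - MAXNUMMESSAGES + min_free

def catchup_minsig(max_msg_num):
    # Backends more than half a ring behind get a catchup signal.
    return max_msg_num - MAXNUMMESSAGES // 2
```

Under this reading, lowbound is "one full ring behind, plus a safety margin" and minsig is "half a ring behind", so the three numbers trade off how aggressively laggards are signalled versus reset; whether that matches the original design rationale is exactly the question being asked.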