[HACKERS] proposal 9.4 plpgsql: allows access to call stack from GET DIAGNOSTICS statement
Hello

I propose enhancing the GET DIAGNOSTICS statement with a new field, PG_CONTEXT. It is similar to GET STACKED DIAGNOSTICS' PG_EXCEPTION_CONTEXT. The motivation for this proposal is the possibility of getting the call stack for debugging without raising an exception. This code is based on cleaned-up code from Orafce, where it has been used for four years without any error reports.

    CREATE OR REPLACE FUNCTION public.inner(integer)
     RETURNS integer
     LANGUAGE plpgsql
    AS $function$
    declare _context text;
    begin
      get diagnostics _context = pg_context;
      raise notice '***%***', _context;
      return 2 * $1;
    end;
    $function$

    postgres=# select outer_outer(10);
    NOTICE:  ***PL/pgSQL function inner(integer) line 4 at GET DIAGNOSTICS
    PL/pgSQL function outer(integer) line 3 at RETURN
    PL/pgSQL function outer_outer(integer) line 3 at RETURN***
    CONTEXT:  PL/pgSQL function outer(integer) line 3 at RETURN
    PL/pgSQL function outer_outer(integer) line 3 at RETURN
     outer_outer
    ─────────────
              20
    (1 row)

Ideas, comments?

Regards

Pavel Stehule

(attachment: get_diagnostics_context_initial.patch)
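The wrapper functions outer() and outer_outer() are not shown in the mail; a minimal sketch that would reproduce the NOTICE output above, under the assumption that they simply pass the value through:

    -- Assumed definitions, not part of the original mail; the "line 3 at
    -- RETURN" entries in the context above match a body whose RETURN
    -- statement falls on line 3.
    CREATE OR REPLACE FUNCTION public.outer(integer)
     RETURNS integer
     LANGUAGE plpgsql
    AS $function$
    begin
      return public.inner($1);
    end;
    $function$;

    CREATE OR REPLACE FUNCTION public.outer_outer(integer)
     RETURNS integer
     LANGUAGE plpgsql
    AS $function$
    begin
      return public.outer($1);
    end;
    $function$;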
[HACKERS] Using indexes for partial index builds
Hello,

After a discussion on IRC in #postgresql, I had a feature suggestion and it was suggested I write it up here.

I have a large (200GB, 1.7b rows) table with a number of columns, but the two of interest here are an hstore column, tags, and a PostGIS geometry column, geom. There is a GIN index on tags and a GiST index on geom. These took about 36-48 hours to build in total. Obviously index building on a table this size is not trivial.

Periodically I want to do a number of specialized queries on objects with a particular tag or in a particular area. To do this I often want to create a partial index. For example, I created the index btree ((tags -> 'name_1'::text) text_pattern_ops) WHERE tags ? 'name_1'::text. My understanding is that to create this index PostgreSQL does a scan of the entire table, even though the GIN index on tags could be used to identify which rows could belong in the index. Where the WHERE condition selects only a small portion of the table, this scans a lot more data than is necessary.

Another case where it would be useful is when I am conducting a detailed analysis of some aspect of the rows in a particular city. This leads to all the queries being of the form SELECT ... FROM ... WHERE is_in_my_area(geom) [1]. My current project is doing analysis involving addresses. The ability to create an index like btree ((tags -> 'addr:housenumber'), (tags -> 'addr:street'), (tags -> 'addr:city')) WHERE is_in_my_area(geom) in a reasonable time would allow me to use a view instead of copying the local area to a temporary table and indexing that table. The local area is about 350k rows, or about 0.02% of the database.

[1] The actual function for determining if it's in my area is long and not really essential to the point here.
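The DDL being described, spelled out as statements. The table itself is never named in the mail, so "osm" below is a hypothetical stand-in; tags is the hstore column and geom the geometry column, and is_in_my_area() is the poster's own function:

    -- Partial index on one tag's value, built only for rows that have the tag:
    CREATE INDEX osm_name_1_idx
        ON osm USING btree ((tags -> 'name_1') text_pattern_ops)
        WHERE tags ? 'name_1';

    -- The per-city address-analysis index:
    CREATE INDEX osm_addr_idx
        ON osm USING btree ((tags -> 'addr:housenumber'),
                            (tags -> 'addr:street'),
                            (tags -> 'addr:city'))
        WHERE is_in_my_area(geom);

The request is that PostgreSQL use the existing GIN index on tags (or GiST index on geom) to find the qualifying rows, instead of sequentially scanning all 1.7b rows to build each of these.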
Re: [HACKERS] proposal - assign result of query to psql variable
Hello

2013/2/1 Tom Lane t...@sss.pgh.pa.us:
> Pavel Stehule pavel.steh...@gmail.com writes:
>> here is patch related to your proposal
>
> I looked at this a bit.  It's getting there, though I still don't trust
> the places where you've chosen to clear the prefix setting.  (Looking at
> it, I'm now not sure that I trust the implementation of \g either.)
> However, what I wanted to ask about was this:
>
> +			if (PQgetisnull(result, 0, i))
> +				value = pset.popt.nullPrint ? pset.popt.nullPrint : "";
> +			else
> +				value = PQgetvalue(result, 0, i);
>
> What's the argument for using nullPrint here?  ISTM that's effectively a
> form of escaping, and I'd not expect that to get applied to values going
> into variables, any more than any other formatting we do when printing
> results.  Admittedly, if we just take the PQgetvalue result blindly,
> there'll be no way to tell the difference between empty-string and NULL
> results.  But I'm not convinced that this approach is better.  It would
> certainly need more than no documentation.

I don't have a strong opinion about storing NULL values - nullPrint is the best of the simple variants. The possible variants:

a) don't store NULL values - remove the existing variable when NULL is assigned - it is probably best, but could be confusing for users

b) implement an IS NULL flag for variables

c) use nullPrint

d) use an empty string

d) is a subset of c), and I think c) gives somewhat better possibilities for only two more lines.

a) is probably best - a more significant change, but not hard to implement:

    SELECT NULL AS x \g pref_
    SELECT :'pref_x' IS NULL;

That can be nice, but it may be premature optimization too - nullPrint is enough for typical use cases.

Regards

Pavel

>			regards, tom lane
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On 1 February 2013 23:56, Tom Lane t...@sss.pgh.pa.us wrote:
> Well, if we were tracking the latest value in shared memory, we could
> certainly clamp to that to ensure it didn't go backwards.  The problem
> is where to find storage for a per-DB value.

Adding new data columns to catalogs in backbranches seems like a great reason to have an hstore column on every catalog table. That way we can just add anything we need without causing other problems.

Obviously something for the future.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On 1 February 2013 23:56, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Fri, Feb 1, 2013 at 2:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> In any case, I no longer have much faith in the idea that letting
>>> GetOldestXmin go backwards is really safe.
>
>> That is admittedly kind of weird behavior, but I think one could
>> equally blame this on CLUSTER.  This is hardly the first time we've had
>> to patch CLUSTER's handling of TOAST tables (cf commits
>> 21b446dd0927f8f2a187d9461a0d3f11db836f77,
>> 7b0d0e9356963d5c3e4d329a917f5fbb82a2ef05,
>> 83b7584944b3a9df064cccac06822093f1a83793) and it doesn't seem unlikely
>> that we might go the full ten rounds.
>
> Yeah, but I'm not sure whether CLUSTER is the appropriate blamee or
> whether it's more like the canary in the coal mine, first to expose
> problematic behaviors elsewhere.  The general problem here is really
> that we're cleaning out toast tuples while the referencing main-heap
> tuple still physically exists.  How safe do you think that is?

Agree that CLUSTER is just first to hit the problem. As long as the problem exists, we have issues.

> That did not ever happen before we decoupled autovacuuming of main and
> toast tables, either --- so a good case could be made that that idea is
> fundamentally broken.

It's broken, but not fundamentally. We can decouple the autovacuuming, as long as we don't decouple the xid used.

Fixing GetOldestXmin() seems like the easiest thing to do for backbranches, but seems a little heavy handed for a general fix - though doing that will probably close any future thinkos of similar nature.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] proposal - assign result of query to psql variable
On Sat, Feb 2, 2013 at 7:30 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
> possible variants
>
> a) don't store NULL values - remove the existing variable when NULL is
> assigned - it is probably best, but could be confusing for users
>
> b) implement an IS NULL flag for variables
>
> c) use nullPrint
>
> d) use an empty string

+1 for a). If users want to determine whether the result was NULL, or want to use a substitute string for a NULL result, they can use coalesce in the SELECT clause. Anyway, the feature should be documented clearly.

-- 
Shigeru HANADA
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On 2013-02-01 15:09:34 -0800, Jeff Janes wrote:
> On Fri, Feb 1, 2013 at 2:34 PM, Andres Freund and...@2ndquadrant.com wrote:
>> On 2013-02-01 14:05:46 -0800, Jeff Janes wrote:
>>> As far as I can tell this bug kicks in when your cluster gets to be
>>> older than freeze_min_age, and then lasts forever after.  After that
>>> point pretty much every auto-vacuum inspired by update/deletion
>>> activity will get promoted to a full table scan.  (Which makes me
>>> wonder how much field-testing the vm-only vacuum has received, if it
>>> was rarely happening in practice due to this bug.)
>>
>> I think you're misreading the code. freezeTableLimit is calculated by
>>
>> 	limit = ReadNewTransactionId() - freezetable;
>>
>> which is always relative to the current xid. The bug was that
>> freezetable had the wrong value in autovac due to freeze_min_age being
>> used instead of freeze_table_age.
>
> Right.  Since freeze_min_age was mistakenly being used, the limit would
> be 50 million in the past (rather than 150 million) under defaults.  But
> since the last full-table vacuum, whenever that was, used freeze_min_age
> for its intended purpose, that means the 50 million in the past *at the
> time of that last vacuum* is the highest that relfrozenxid can be.  And
> that is going to be further back than 50 million from right now, so the
> vacuum will always be promoted to a full scan.

Oh, wow. You're right. I shouldn't answer emails after sport with cramped fingers on a Friday night... And I should have thought about this scenario, because I essentially already explained it upthread, just with a different set of variables.

This is rather scary. How come nobody noticed that this major performance improvement was effectively disabled for that long? I wonder if Kevin's observations about the price of autovac during OLTPish workloads aren't at least partially caused by this. It will cause lots of io prematurely, because it scans far more than it should, and a VACUUM FREEZE will push it off.

> As an aside, it does seem like log_autovacuum_min_duration=0 should log
> whether a scan_all was done, and if so what relfrozenxid got set to.
> But looking at where the log message is generated, I don't know where
> to retrieve that info.

Yes, I agree; I had already been thinking about that because it's really hard to get that information right now. It seems easy enough to include it in the ereport() at the bottom of lazy_vacuum_rel; we determine scan_all in that function, so that seems ok? For head I would actually vote for two data points - full_table_scan: yes/no, skipped_percentage - both are already available, so it seems like it should be an easy thing to do.

I'd like to do this for 9.3, agreed? I would even like to add it to the back branches, but I guess I cannot convince people of that...

> [1] I don't know why it is that a scan_all vacuum with a freeze_min_age
> of 50m (or a freezeLimit of 50 million ago) will not set relfrozenxid
> to a higher value than that if it discovers that it can, but it doesn't
> seem to.

There currently is no code to track what's the oldest observed xid, so it is a simple implementation limitation. Making that code better might be rather worthwhile if you're loading your table in a batch and don't touch it later anymore...

Greetings,

Andres Freund

-- 
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
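The effect Jeff describes can be eyeballed from SQL. A rough diagnostic sketch, under the assumption of default GUC values: with the bug, autovacuum effectively compares against vacuum_freeze_min_age (50M) rather than vacuum_freeze_table_age (150M), so any table on this list gets its next update/delete-triggered vacuum promoted to a full-table scan:

    -- Tables whose relfrozenxid age already exceeds the (mistakenly used)
    -- vacuum_freeze_min_age threshold; illustrative only.
    SELECT c.oid::regclass AS tablename,
           age(c.relfrozenxid) AS xid_age
    FROM pg_class c
    WHERE c.relkind = 'r'
      AND age(c.relfrozenxid) >
          current_setting('vacuum_freeze_min_age')::bigint
    ORDER BY 2 DESC;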
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On 2013-02-01 16:59:52 -0500, Robert Haas wrote:
> I don't think I really understand the origin of the formula, so perhaps
> if someone would try to characterize why it seems to behave reasonably
> that would be helpful (at least to me).
>
>     f(deadtuples, relpages, age) =
>         deadtuples/relpages + e ^ (age * ln(relpages) / 2^32)
>
> To maybe make that discussion go more quickly let me kvetch about a few
> things to kick things off:

I am not too happy with the formula yet either, but it seems you've started the discussion in the right direction...

> - It's probably important to have a formula where we can be sure that
>   the wrap-around term will eventually dominate the dead-tuple term,
>   with enough time to spare to make sure nothing really bad happens; on
>   the other hand, it's also desirable to avoid the case where a table
>   that has just crossed the threshold for wraparound vacuuming
>   immediately shoots to the top of the list even if it isn't truly
>   urgent.  It's unclear to me just from looking at this formula how
>   well the second term meets those goals.

I just wanted to mention that if everything goes well, we won't *ever* get to an anti-wraparound vacuum. Normally the table should cross the vacuum_table_age barrier earlier and promote a normal vacuum to a full-table vacuum, which will set relfrozenxid to a new and lower value and thus prevent anti-wraparound vacuums from occurring. So prioritizing anti-wraparound vacuums immediately and heavily doesn't seem to be too bad.

> - More generally, it seems to me that we ought to be trying to think
>   about the units in which these various quantities are measured.  Each
>   term ought to be unit-less.  So perhaps the first term ought to
>   divide dead tuples by total tuples, which has the nice property that
>   the result is a dimensionless quantity that never exceeds 1.0.  Then
>   the second term can be scaled somehow based on that value.

I think we also need to be careful not to get too elaborate on this end. Once the general code for prioritization is in, the exact prioritization formula can easily be tweaked incrementally. Just about any half-way sensible prioritization is better than what we have right now, and we might discover new effects once we do marginally better.

Imo the browne_strength field should be called 'priority' and the prioritization calculation formula should be moved into an extra function.

Greetings,

Andres Freund

-- 
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
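To get a feel for how the proposed formula orders real tables, it can be approximated from the statistics views. A purely illustrative sketch (not the patch's actual code): it reads "age" as age(relfrozenxid) and "deadtuples" as the n_dead_tup estimate, which are assumptions about the formula's inputs:

    -- Approximation of f(deadtuples, relpages, age) =
    --   deadtuples/relpages + e^(age * ln(relpages) / 2^32)
    SELECT s.relname,
           s.n_dead_tup::float8 / GREATEST(c.relpages, 1)
           + exp(age(c.relfrozenxid)::float8
                 * ln(GREATEST(c.relpages, 1)) / 2.0 ^ 32) AS priority
    FROM pg_stat_user_tables s
    JOIN pg_class c ON c.oid = s.relid
    ORDER BY priority DESC;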
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On 2013-02-01 19:24:02 -0500, Tom Lane wrote:
>  * There are also replication-related effects: a walsender
>  * process can set its xmin based on transactions that are no longer running
>  * in the master but are still being replayed on the standby, thus possibly
>  * making the GetOldestXmin reading go backwards.  In this case there is a
>  * possibility that we lose data that the standby would like to have, but
>  * there is little we can do about that --- data is only protected if the
>  * walsender runs continuously while queries are executed on the standby.
>  * (The Hot Standby code deals with such cases by failing standby queries
>  * that needed to access already-removed data, so there's no integrity bug.)
>
> This is just bogus.  Why don't we make it a requirement on walsenders
> that they never move their advertised xmin backwards (or initially set
> it to less than the prevailing global xmin)?  There's no real benefit to
> allowing them to try to move the global xmin backwards, because any data
> that they might hope to protect that way could be gone already.

The problem imo has multiple sides:

a) We currently don't have a way to assign the minimal valid xmin to the current backend. I already need that for logical replication, so we might just as well add it now:
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=commitdiff;h=84f6ed9550b65b0584589977877e9a997daeb61d

With that it's just a matter of taking ProcArrayLock exclusively before calling GetOldestXmin and assigning the result to MyPgXact->xmin before releasing it.

b) We don't assign the xmin early enough; we only set it when the first feedback message arrives, but we should set it when walsender starts streaming.

c) After a disconnect the feedback message will rather likely ask for an xmin horizon that's not valid anymore on the primary. If the disconnect was short enough, often enough that doesn't matter because nothing has been cleared out, but it doesn't really work all that well. That's still better than setting it to the currently valid minimal xmin horizon, because it prevents cleanup from that moment on. I don't see how this can be significantly improved without persistent knowledge about standbys.

>  * The return value is also adjusted with vacuum_defer_cleanup_age, so
>  * increasing that setting on the fly is another easy way to make
>  * GetOldestXmin() move backwards, with no consequences for data integrity.
>
> And as for that, it's been pretty clear for awhile that allowing
> vacuum_defer_cleanup_age to change on the fly was a bad idea we'd
> eventually have to undo.  The day of reckoning has arrived: it needs to
> be PGC_POSTMASTER.

Not sure if that would really solve anything; similar problems seem to exist even if the postmaster has been restarted in between.

Btw, I wonder if that comment in GetOldestXmin still has good justification, now that we allow cascading standbys: "Also note that we intentionally don't apply vacuum_defer_cleanup_age on standby servers."

Greetings,

Andres Freund

-- 
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] proposal - assign result of query to psql variable
2013/2/2 Shigeru Hanada shigeru.han...@gmail.com:
> On Sat, Feb 2, 2013 at 7:30 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>> possible variants
>>
>> a) don't store NULL values - remove the existing variable when NULL is
>> assigned - it is probably best, but could be confusing for users
>>
>> b) implement an IS NULL flag for variables
>>
>> c) use nullPrint
>>
>> d) use an empty string
>
> +1 for a).  If users want to determine whether the result was NULL, or
> want to use a substitute string for a NULL result, they can use coalesce
> in the SELECT clause.  Anyway, the feature should be documented clearly.

Yes, and this has one other advantage - it doesn't block possibly enhancing variables with NULL support in the future. And another - it doesn't depend on psql settings.

Regards

Pavel

-- 
Shigeru HANADA
Re: [HACKERS] COPY FREEZE has no warning
On Fri, Feb 01, 2013 at 12:57:18PM -0500, Bruce Momjian wrote:
> On Tue, Jan 29, 2013 at 08:34:24PM -0500, Noah Misch wrote:
>> On Fri, Jan 25, 2013 at 11:28:58PM -0500, Bruce Momjian wrote:
>>> 	BEGIN;
>>> 	TRUNCATE vistest;
>>> 	SAVEPOINT s1;
>>> 	COPY vistest FROM stdin CSV FREEZE;
>>> 	ERROR:  cannot perform FREEZE because of previous table activity in the current transaction
>>> 	COMMIT;
>>>
>>> Clearly it was truncated in the same transaction, but the savepoint
>>> somehow invalidates the freeze.  There is a C comment about it:
>>
>> The savepoint prevents the COPY FREEZE, because COPY FREEZE needs the
>> table to have been created or truncated in the current *sub*transaction.
>> Issuing RELEASE s1 before the COPY makes it work again, for example.
>>
>> 	BEGIN;
>> 	TRUNCATE t;
>> 	SAVEPOINT save;
>> 	TRUNCATE t;
>> 	ROLLBACK TO save;
>> 	COPY ...
>>
>> This is different.  The table was truncated in the current
>> subtransaction, and it's safe in principle to apply the optimization.
>> Due to an implementation artifact, we'll reject it anyway.
>
> OK, so, should we change the error message:
>
> 	cannot perform FREEZE because of transaction activity after table creation or truncation
>
> to
>
> 	cannot perform FREEZE because the table was not created or truncated in the current subtransaction
>
> or do we need to keep the "transaction activity" weasel wording because
> of the second case you listed above?  I am suspecting the latter.

Let's touch on the exception in passing by using the phrase "last truncated", giving this wording for both the second and the third COPY FREEZE error sites:

	cannot perform FREEZE because the table was not created or last truncated in the current subtransaction
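The RELEASE workaround Noah mentions, spelled out as a session sketch using the vistest table from the example (behavior as described in the thread, not separately verified here): releasing the savepoint before the COPY lets FREEZE be accepted again.

    BEGIN;
    TRUNCATE vistest;
    SAVEPOINT s1;
    RELEASE s1;                          -- the truncation again counts as the
    COPY vistest FROM stdin CSV FREEZE;  -- current subtransaction's work
    COMMIT;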
Re: [HACKERS] COPY FREEZE has no warning
On Sat, Feb 2, 2013 at 09:51:13AM -0500, Noah Misch wrote:
>> OK, so, should we change the error message:
>>
>> 	cannot perform FREEZE because of transaction activity after table creation or truncation
>>
>> to
>>
>> 	cannot perform FREEZE because the table was not created or truncated in the current subtransaction
>>
>> or do we need to keep the "transaction activity" weasel wording because
>> of the second case you listed above?  I am suspecting the latter.
>
> Let's touch on the exception in passing by using the phrase "last
> truncated", giving this wording for both the second and the third COPY
> FREEZE error sites:
>
> 	cannot perform FREEZE because the table was not created or last truncated in the current subtransaction

Well, so you are saying that there really isn't any user-visible logic for those messages to be different, i.e. that the transaction id can be set to invalid even if we created/truncated in the same transaction, but not the same subtransaction? The comparisons that trigger the two messages are:

	if (cstate->rel->rd_createSubid != InvalidSubTransactionId ||
		cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId)

and

	if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() ||
		cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())

The first comparison is creation, the second, truncation. Please confirm and I will make the change, or you can.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: [HACKERS] Turning auto-analyze off (was Re: [GENERAL] Unusually high IO for autovacuum worker)
On Fri, Feb 1, 2013 at 5:54 PM, Pavan Deolasee pavan.deola...@gmail.com wrote:
> There is another problem that I noticed while looking at this case. The
> analyze took close to 500sec on a fairly good hardware (40GB RAM, 10K
> rpm disks on RAID10) because many large child tables were scanned at
> once.

Just a small correction, to keep the benchmark accurate: the ~500 sec analyze time was on a test VM running on an i5 2.4 GHz with 2 dedicated cores, 4 GB of RAM, and stored on a notebook HDD. The partitioned data was about 80GB.

On our production environment (described by Pavan) it took ~90 seconds for ~55GB of data in 8 child/partition tables (we stopped the import of partitioned data when we discovered this issue - we COPYed and TRUNCATEd partitions to speed up the upgrade procedure from 8.4 to 9.1 by pg_dump/pg_restore).

Vlad
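Since the pain comes from a single ANALYZE of the inheritance parent sampling every child in one pass, one mitigation is to analyze the children individually and spread the I/O out. A sketch with hypothetical table names:

    -- ANALYZE on the parent samples all child partitions in one pass
    -- (it collects both parent-only and inheritance-tree statistics):
    ANALYZE VERBOSE events;
    -- Analyzing children one at a time spreads the work out:
    ANALYZE events_2013_01;
    ANALYZE events_2013_02;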
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On Fri, Feb 1, 2013 at 7:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> Having said that, I agree that a fix in GetOldestXmin() would be nice
>> if we could find one, but since the comment describes at least three
>> different ways the value can move backwards, I'm not sure that there's
>> really a practical solution there, especially if you want something we
>> can back-patch.
>
> Actually, wait a second.  As you say, the comment describes three known
> ways to make it go backwards.  It strikes me that all three are fixable:
>
>  * if allDbs is FALSE and there are no transactions running in the current
>  * database, GetOldestXmin() returns latestCompletedXid. If a transaction
>  * begins after that, its xmin will include in-progress transactions in other
>  * databases that started earlier, so another call will return a lower value.
>
> The reason this is a problem is that GetOldestXmin ignores XIDs of
> processes that are connected to other DBs.  It now seems to me that this
> is a flat-out bug.  It can ignore their xmins, but it should include
> their XIDs, because the point of considering those XIDs is that they may
> contribute to the xmins of snapshots computed in the future by processes
> in our own DB.  And snapshots never exclude any XIDs on the basis of
> which DB they're in.  (They can't really, since we can't know when the
> snap is taken whether it might be used to examine shared catalogs.)

This idea is superficially appealing, but I can't get excited about it, because it will cause GetOldestXmin() to compute older (that is, more conservative) XIDs than it does today, which means we'll vacuum away less. Considering the amount of pain we have around vacuum today, that's surely got to be going the wrong direction.

I'm inclined to think that the root of the problem is that the computation is too conservative already. I mean, what do we really need to include in the computation in order not to vacuum away any data that's still needed? There are basically two concerns:

- You can't remove data that any snapshot in the current database might try to read. So you have to include the xmins of those snapshots in your oldest-xmin value. But, not really, because those xmins were computed by looking at all running XIDs across the whole database, and if their oldest xmin value came from some XID in another database, it's not relevant. What you want (but of course, can't get) is the db-specific xmin horizon for each other snapshot.

- You have to worry about new snapshots that are about to appear but haven't yet. But, really, how big of a concern is this? No transactions can exit the running set while we hold ProcArrayLock, so it seems to me that there ought to be a pretty tight limit on what can show up here. For example, if we've already taken a snapshot, surely a subsequently taken snapshot can't really need an older xmin horizon than the one we computed for our own snapshot.

In theory, it seems like we might be able to solve these problems by having each backend advertise TWO xmin horizons: one representing the oldest xmin it might be able to see in a system catalog, and the other representing the oldest xmin it might be able to see in its own database. This would allow significantly more precise tracking of what is really needed in the current database, but there's an obvious implementation concern, which is that GetSnapshotData is hot enough that individual instructions matter.

>  * There are also replication-related effects: a walsender
>  * process can set its xmin based on transactions that are no longer running
>  * in the master but are still being replayed on the standby, thus possibly
>  * making the GetOldestXmin reading go backwards.  In this case there is a
>  * possibility that we lose data that the standby would like to have, but
>  * there is little we can do about that --- data is only protected if the
>  * walsender runs continuously while queries are executed on the standby.
>  * (The Hot Standby code deals with such cases by failing standby queries
>  * that needed to access already-removed data, so there's no integrity bug.)
>
> This is just bogus.  Why don't we make it a requirement on walsenders
> that they never move their advertised xmin backwards (or initially set
> it to less than the prevailing global xmin)?  There's no real benefit to
> allowing them to try to move the global xmin backwards, because any data
> that they might hope to protect that way could be gone already.

I think there is already some code that aims to do this, but I might be misremembering. IOW, this comment might already be incorrect.

>  * The return value is also adjusted with vacuum_defer_cleanup_age, so
>  * increasing that setting on the fly is another easy way to make
>  * GetOldestXmin() move backwards, with no consequences for data integrity.
>
> And as for that, it's been pretty clear for awhile that allowing
> vacuum_defer_cleanup_age to change on the fly was a bad idea we'd
> eventually have to undo.  The day of reckoning has arrived: it needs to
> be PGC_POSTMASTER.
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On Fri, Feb 1, 2013 at 6:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> That is admittedly kind of weird behavior, but I think one could
>> equally blame this on CLUSTER.  This is hardly the first time we've had
>> to patch CLUSTER's handling of TOAST tables (cf commits
>> 21b446dd0927f8f2a187d9461a0d3f11db836f77,
>> 7b0d0e9356963d5c3e4d329a917f5fbb82a2ef05,
>> 83b7584944b3a9df064cccac06822093f1a83793) and it doesn't seem unlikely
>> that we might go the full ten rounds.
>
> Yeah, but I'm not sure whether CLUSTER is the appropriate blamee or
> whether it's more like the canary in the coal mine, first to expose
> problematic behaviors elsewhere.  The general problem here is really
> that we're cleaning out toast tuples while the referencing main-heap
> tuple still physically exists.  How safe do you think that is?
>
> That did not ever happen before we decoupled autovacuuming of main and
> toast tables, either --- so a good case could be made that that idea is
> fundamentally broken.

I confess I'm not entirely sanguine about that. A similar issue arises with index tuples referencing heap tuples, and we go to great lengths to make sure that the last vestiges of the heap tuple can't be removed until the index tuples are well and truly dead. If we were to adopt the same approach here, it would mean that, instead of consulting XIDs at all, we'd remove TOAST pointers only when there were provably no main-table tuples pointing to them. I'm not really proposing that, but it's an interesting thought-experiment. What we're actually doing is relying on values to match exactly that were never intended as more than an approximation.

Can you point me to the commit that decoupled this? What was the motivation for it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
On Fri, Feb 1, 2013 at 12:04 AM, Amit Kapila amit.kap...@huawei.com wrote:
> I think user should be aware of effect before using SET commands, as
> these are used at various levels (TRANSACTION, SESSION, ...).

Ideally, sure. But these kinds of mistakes are easy to make. That's why LOCK and DECLARE CURSOR already emit errors in this case - why should this one be any different?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On Fri, Feb 1, 2013 at 6:09 PM, Jeff Janes jeff.ja...@gmail.com wrote:
> As an aside, it does seem like log_autovacuum_min_duration=0 should log
> whether a scan_all was done, and if so what relfrozenxid got set to.

That would be nifty.

> [1] I don't know why it is that a scan_all vacuum with a freeze_min_age
> of 50m (or a freezeLimit of 50 million ago) will not set relfrozenxid
> to a higher value than that if it discovers that it can, but it doesn't
> seem to.

That also seems very much worth fixing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On Sat, Feb 2, 2013 at 8:41 AM, Andres Freund and...@2ndquadrant.com wrote:
>> - It's probably important to have a formula where we can be sure that
>>   the wrap-around term will eventually dominate the dead-tuple term,
>>   with enough time to spare to make sure nothing really bad happens; on
>>   the other hand, it's also desirable to avoid the case where a table
>>   that has just crossed the threshold for wraparound vacuuming
>>   immediately shoots to the top of the list even if it isn't truly
>>   urgent.  It's unclear to me just from looking at this formula how
>>   well the second term meets those goals.
>
> I just wanted to mention that if everything goes well, we won't *ever*
> get to an anti-wraparound vacuum. Normally the table should cross the
> vacuum_table_age barrier earlier and promote a normal vacuum to a
> full-table vacuum, which will set relfrozenxid to a new and lower value
> and thus prevent anti-wraparound vacuums from occurring. So prioritizing
> anti-wraparound vacuums immediately and heavily doesn't seem to be too
> bad.

IMHO, this is hopelessly optimistic. Yes, it's intended to work that way. But INSERT-only or INSERT-mostly tables are far from an uncommon use case; in fact they're probably the most common cause of pain in this area. You insert a gajillion tuples, and vacuum never kicks off, and then eventually you either update some tuples or hit autovacuum_freeze_max_age and suddenly, BAM, you get this gigantic vacuum that rewrites the entire table. And then you open a support ticket with your preferred PostgreSQL support provider and say something like "WTF?".

>> - More generally, it seems to me that we ought to be trying to think
>>   about the units in which these various quantities are measured.  Each
>>   term ought to be unit-less.  So perhaps the first term ought to
>>   divide dead tuples by total tuples, which has the nice property that
>>   the result is a dimensionless quantity that never exceeds 1.0.  Then
>>   the second term can be scaled somehow based on that value.
>
> I think we also need to be careful not to get too elaborate on this
> end. Once the general code for prioritization is in, the exact
> prioritization formula can easily be tweaked incrementally. Just about
> any half-way sensible prioritization is better than what we have right
> now, and we might discover new effects once we do marginally better.

I agree. It would be nice to have some way of measuring the positive or negative impact of what we introduce, too, but I don't have a good idea what that would be.

> Imo the browne_strength field should be called 'priority' and the
> prioritization calculation formula should be moved into an extra
> function.

Yeah, or maybe vacuum_priority, since that would be easier to grep for.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
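Until the scheduler learns to handle this itself, the usual workaround for the insert-only pattern Robert describes is to freeze such tables manually during quiet periods, so relfrozenxid stays current and autovacuum_freeze_max_age is never reached. A sketch with a hypothetical table name:

    -- Run off-peak on insert-only tables to avoid the surprise
    -- full-table anti-wraparound vacuum later:
    VACUUM (FREEZE, VERBOSE) big_append_only_log;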
Re: [HACKERS] COPY FREEZE has no warning
Bruce Momjian br...@momjian.us writes:
> Well, so you are saying that there really isn't any user-visible logic
> for those messages to be different,

No, and in fact the whole block of code is badly written because it conflates two unrelated tests. I guess somebody was trying to save a couple of nanoseconds by not calling GetCurrentSubTransactionId if a previous test had failed, but really why should we care about that number of cycles in COPY preliminaries? The code ought to be more like this:

	/* comment about skipping FSM or WAL here */
	if (cstate->rel->rd_createSubid != InvalidSubTransactionId ||
		cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId)
	{
		hi_options |= HEAP_INSERT_SKIP_FSM;
		if (!XLogIsNeeded())
			hi_options |= HEAP_INSERT_SKIP_WAL;
	}

	/* comment about when we can perform FREEZE here */
	if (cstate->freeze)
	{
		if (!ThereAreNoPriorRegisteredSnapshots() || !ThereAreNoReadyPortals())
			ereport(ERROR,
					(ERRCODE_INVALID_TRANSACTION_STATE,
					 errmsg("cannot perform FREEZE because of prior transaction activity")));

		if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
			cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
			ereport(ERROR,
					(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE,
					 errmsg("cannot perform FREEZE because the table was not created or truncated in the current subtransaction")));

		hi_options |= HEAP_INSERT_FROZEN;
	}

			regards, tom lane
Re: [HACKERS] proposal - assign result of query to psql variable
Shigeru Hanada shigeru.han...@gmail.com writes:
> On Sat, Feb 2, 2013 at 7:30 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>> possible variants
>>
>> a) don't store NULL values - remove the existing variable when NULL is
>> assigned - it is probably best, but could be confusing for users
>>
>> b) implement an IS NULL flag for variables
>>
>> c) use nullPrint
>>
>> d) use an empty string
>
> +1 for a).  If users want to determine whether the result was NULL, or
> want to use a substitute string for a NULL result, they can use coalesce
> in the SELECT clause.  Anyway, the feature should be documented clearly.

Yeah, I was considering that one too. Let's do it that way.

			regards, tom lane
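For the archives: behaviour (a) is what eventually shipped as psql's \gset in PostgreSQL 9.3 - a NULL column unsets the target variable rather than setting it. A session sketch using the released syntax, which may differ in detail from the patch under review in this thread:

    postgres=# SELECT 1 AS x \gset pref_
    postgres=# \echo :pref_x
    1
    postgres=# SELECT NULL::int AS x \gset pref_
    postgres=# \echo :pref_x
    :pref_x

The last line shows the unset variable: since pref_x no longer exists, the reference is left unsubstituted, which is exactly the "remove existing variable when NULL is assigned" semantics of option (a).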
Re: [HACKERS] COPY FREEZE has no warning
On Sat, Feb 2, 2013 at 12:09:05PM -0500, Tom Lane wrote:
> Bruce Momjian br...@momjian.us writes:
>> Well, so you are saying that there really isn't any user-visible logic
>> for those messages to be different,
>
> No, and in fact the whole block of code is badly written because it
> conflates two unrelated tests.  I guess somebody was trying to save a
> couple of nanoseconds by not calling GetCurrentSubTransactionId if a
> previous test had failed, but really why should we care about that
> number of cycles in COPY preliminaries?  The code ought to be more like
> this:
>
> 	/* comment about skipping FSM or WAL here */
> 	if (cstate->rel->rd_createSubid != InvalidSubTransactionId ||
> 		cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId)
> 	{
> 		hi_options |= HEAP_INSERT_SKIP_FSM;
> 		if (!XLogIsNeeded())
> 			hi_options |= HEAP_INSERT_SKIP_WAL;
> 	}
>
> 	/* comment about when we can perform FREEZE here */
> 	if (cstate->freeze)
> 	{
> 		if (!ThereAreNoPriorRegisteredSnapshots() || !ThereAreNoReadyPortals())
> 			ereport(ERROR,
> 					(ERRCODE_INVALID_TRANSACTION_STATE,
> 					 errmsg("cannot perform FREEZE because of prior transaction activity")));
>
> 		if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
> 			cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
> 			ereport(ERROR,
> 					(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE,
> 					 errmsg("cannot perform FREEZE because the table was not created or truncated in the current subtransaction")));
>
> 		hi_options |= HEAP_INSERT_FROZEN;
> 	}

Yes, I found the blocking odd too --- the test for InvalidSubTransactionId is used by hi_options, and for freeze checking. I assumed != InvalidSubTransactionId and != GetCurrentSubTransactionId() had different meanings for freeze checking. I compounded the problem because originally there was no FREEZE failure, so no action was taken if != InvalidSubTransactionId.

Applied patch attached.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + It's impossible for everything to be true. +

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
new file mode 100644
index 49cc8dd..523c1e0
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
*** CopyFrom(CopyState cstate)
*** 1996,2031 ****
  			hi_options |= HEAP_INSERT_SKIP_FSM;
  			if (!XLogIsNeeded())
  				hi_options |= HEAP_INSERT_SKIP_WAL;
!
! 			/*
! 			 * Optimize if new relfilenode was created in this subxact or
! 			 * one of its committed children and we won't see those rows later
! 			 * as part of an earlier scan or command. This ensures that if this
! 			 * subtransaction aborts then the frozen rows won't be visible
! 			 * after xact cleanup. Note that the stronger test of exactly
! 			 * which subtransaction created it is crucial for correctness
! 			 * of this optimisation.
! 			 */
! 			if (cstate->freeze)
! 			{
! 				if (!ThereAreNoPriorRegisteredSnapshots() || !ThereAreNoReadyPortals())
! 					ereport(ERROR,
! 							(ERRCODE_INVALID_TRANSACTION_STATE,
! 							 errmsg("cannot perform FREEZE because of prior transaction activity")));
!
! 				if (cstate->rel->rd_createSubid == GetCurrentSubTransactionId() ||
! 					cstate->rel->rd_newRelfilenodeSubid == GetCurrentSubTransactionId())
! 					hi_options |= HEAP_INSERT_FROZEN;
! 				else
! 					ereport(ERROR,
! 							(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE,
! 							 errmsg("cannot perform FREEZE because of transaction activity after table creation or truncation")));
  		}
  	}
- 	else if (cstate->freeze)
- 		ereport(ERROR,
- 				(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE,
- 				 errmsg("cannot perform FREEZE because the table was not created or truncated in the current transaction")));

  	/*
  	 * We need a ResultRelInfo so we can use the regular executor's
--- 1996,2027 ----
  			hi_options |= HEAP_INSERT_SKIP_FSM;
  			if (!XLogIsNeeded())
  				hi_options |= HEAP_INSERT_SKIP_WAL;
+ 		}
!
! 		/*
! 		 * Optimize if new relfilenode was created in this subxact or
! 		 * one of its committed children and we won't see those rows later
! 		 * as part of an earlier scan or command. This ensures that if this
! 		 * subtransaction aborts then the frozen rows won't be visible
! 		 * after xact cleanup. Note that the stronger test of exactly
! 		 * which subtransaction created it is crucial for correctness
! 		 * of this optimisation.
! 		 */
! 		if (cstate->freeze)
! 		{
! 			if (!ThereAreNoPriorRegisteredSnapshots() || !ThereAreNoReadyPortals())
! 				ereport(ERROR,
! 						(ERRCODE_INVALID_TRANSACTION_STATE,
! 						 errmsg("cannot perform FREEZE because of prior transaction activity")));
!
! 			if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
! 				cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
! 				ereport(ERROR,
! 						(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE,
! 						 errmsg("cannot perform FREEZE because the table was not created or truncated in the current subtransaction")));
!
! 			hi_options |= HEAP_INSERT_FROZEN;
! 		}

  	/*
  	 * We need a ResultRelInfo so we can use the regular executor's
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On 2 February 2013 14:24, Andres Freund and...@2ndquadrant.com wrote:
> b) We don't assign the xmin early enough; we only set it when the first
> feedback message arrives, but we should set it when walsender starts
> streaming.

That's easy to fix.

> c) After a disconnect the feedback message will rather likely ask for an
> xmin horizon that's not valid anymore on the primary. If the disconnect
> was short enough, often enough that doesn't matter because nothing has
> been cleared out, but it doesn't really work all that well. That's still
> better than setting it to the currently valid minimal xmin horizon,
> because it prevents cleanup from that moment on. I don't see how this
> can be significantly improved without persistent knowledge about
> standbys.

We could delay startup of the standby until the xmin on the standby reaches the xmin on the master. So when the standby has hot_standby_feedback = on, at standby connection we set the xmin of the walsender to be the current value on the master, then we disallow connections on the standby until we have replayed up to that xmin on the standby.

That way the xmin of the walsender never goes backwards, nor do we get cancellations on the standby.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
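For reference, the setting being discussed is configured on the standby side; a minimal configuration sketch plus a runtime check:

    -- postgresql.conf on the standby:
    --   hot_standby          = on
    --   hot_standby_feedback = on
    -- Confirm at runtime from any session on the standby:
    SHOW hot_standby_feedback;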
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On 2 February 2013 00:24, Tom Lane t...@sss.pgh.pa.us wrote:
>  * The return value is also adjusted with vacuum_defer_cleanup_age, so
>  * increasing that setting on the fly is another easy way to make
>  * GetOldestXmin() move backwards, with no consequences for data integrity.
>
> And as for that, it's been pretty clear for awhile that allowing
> vacuum_defer_cleanup_age to change on the fly was a bad idea we'd
> eventually have to undo.  The day of reckoning has arrived: it needs to
> be PGC_POSTMASTER.

Agreed, will fix.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On 2 February 2013 00:24, Tom Lane t...@sss.pgh.pa.us wrote:
>  * if allDbs is FALSE and there are no transactions running in the current
>  * database, GetOldestXmin() returns latestCompletedXid. If a transaction
>  * begins after that, its xmin will include in-progress transactions in other
>  * databases that started earlier, so another call will return a lower value.
>
> The reason this is a problem is that GetOldestXmin ignores XIDs of
> processes that are connected to other DBs.  It now seems to me that this
> is a flat-out bug.  It can ignore their xmins, but it should include
> their XIDs, because the point of considering those XIDs is that they may
> contribute to the xmins of snapshots computed in the future by processes
> in our own DB.  And snapshots never exclude any XIDs on the basis of
> which DB they're in.  (They can't really, since we can't know when the
> snap is taken whether it might be used to examine shared catalogs.)

Agree that's a bug.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On 2013-02-02 18:32:44 +0000, Simon Riggs wrote:
> On 2 February 2013 14:24, Andres Freund and...@2ndquadrant.com wrote:
>> b) We don't assign the xmin early enough; we only set it when the first
>> feedback message arrives, but we should set it when walsender starts
>> streaming.
>
> That's easy to fix.

Not trivial, but not too hard, yes. When the standby initially connects we don't yet know which xid will be required, because consistency hasn't yet been achieved.

>> c) After a disconnect the feedback message will rather likely ask for
>> an xmin horizon that's not valid anymore on the primary. If the
>> disconnect was short enough, often enough that doesn't matter because
>> nothing has been cleared out, but it doesn't really work all that
>> well. That's still better than setting it to the currently valid
>> minimal xmin horizon, because it prevents cleanup from that moment on.
>> I don't see how this can be significantly improved without persistent
>> knowledge about standbys.
>
> We could delay startup of the standby until the xmin on the standby
> reaches the xmin on the master. So when the standby has
> hot_standby_feedback = on, at standby connection we set the xmin of the
> walsender to be the current value on the master, then we disallow
> connections on the standby until we have replayed up to that xmin on
> the standby.
>
> That way the xmin of the walsender never goes backwards, nor do we get
> cancellations on the standby.

That's easy enough when the standby is initially started, but doesn't work that well if just the connection between both failed (or the master was restarted) and a reconnect worked. To make it work similarly in that case we would have to throw everyone out, which would kind of counteract the whole idea.

Greetings,

Andres Freund

-- 
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Cascading replication: should we detect/prevent cycles?
On Thu, Jan 31, 2013 at 9:48 PM, Josh Berkus j...@agliodbs.com wrote:
> On 02/01/2013 12:01 PM, Josh Berkus wrote:
>> If we're going to start installing safeguards against doing stupid
>> things, there's a long list of scenarios that happen far more regularly
>> than this ever will and cause far more damage.
>
> What's wrong with making it easier for sysadmins to troubleshoot things?
> Again, I'm not talking about erroring out, I'm talking about logging a
> warning.
>
> Or to put it another way: Robert, you just did a "nobody wants that" to
> me.  I thought you were opposed to such things on this list.

I respectfully disagree. I'm saying that *I* don't want that, which I think is different. To interpret my opposition to saying "nobody wants that" to mean you can never oppose anything someone else thinks is a good idea would preclude meaningful dialogue on most of what we talk about here. And clearly there is at least some demand for this feature, because you and Craig Ringer both want it.

So let me try to restate my objection to this specific feature more clearly. I think that we should be careful about warning the user about things that might not actually be mistakes. I'm not aware that we currently issue ANY warnings of that type. When we emit error messages, we sometimes suggest one possible cause of the error, and such messages are clearly labelled as HINT. But we don't, for example, emit a WARNING or ERROR about a DELETE or UPDATE statement that lacks a WHERE clause, even though many people might like to have such a feature. We don't warn a user "hey, float8 is imprecise, consider using numeric" or "hey, numeric is slow, consider using float8" or "setting autovacuum_naptime to an hour is probably dumber than pouring sugar in your gas tank", even though all of those things are true and some people might like to be warned. We only warn or error out when something happens that we are 100% sure is bad. And, in this particular case, it has been suggested that there are legitimate reasons why a replication topology might temporarily involve loops, so I believe this fails that criterion.

Second, we have often discussed the importance of avoiding log spam. Warnings that are likely to be repeated a large number of times when they occur have repeatedly been voted down on those grounds. I believe that objection also applies to this case. It is more appropriate to make information about the status of the system available via some status-inquiry function; for example, if you were to recast this as adding a slave-side function that attempts to return the IP of the current master, or NULL if no master, that would answer this objection (but not necessarily all of the other ones).

Third, we usually apply a criterion that warnings or errors must represent conditions that we can reliably detect; in other words, we typically do not add checks for situations that we will only sometimes be able to identify. And, in this case, it's a little unclear how we would actually identify loops. Presumably, we'd do it by sending a chain of unique per-node identifiers along with the WAL, and looking for your own identifier in the path, but we don't have any sort of unique per-node identifier right now, and how would you create one? If someone shuts down the cluster, duplicates it, and starts up both copies, we want that to work. Any identifier embedded in the cluster by such a process would be duplicated.

You could use something like the node IP and port number, which wouldn't have that pitfall, but as we all know, IPs can be duplicated (e.g. due to NAT), so this isn't necessarily reliable either. If you do come up with a suitable unique per-node identifier, then this is fairly simple to make work for streaming replication, but it's tricky to see how to make it work with archiving.

Is that more clear?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
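Robert's status-inquiry alternative, sketched as an interface. pg_is_in_recovery() is a real function (9.0+); the master-address function is purely hypothetical and is shown only to illustrate the proposed shape:

    -- Real: are we a standby at all?
    SELECT pg_is_in_recovery();

    -- Hypothetical inquiry function along the lines suggested above:
    -- on a standby, return the current master's address; otherwise NULL.
    -- (Does not exist in PostgreSQL; illustration only.)
    --   SELECT pg_current_master_address();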
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On 2013-02-02 11:25:01 -0500, Robert Haas wrote:
> On Sat, Feb 2, 2013 at 8:41 AM, Andres Freund and...@2ndquadrant.com wrote:
>> I just wanted to mention that if everything goes well, we won't *ever*
>> get to an anti-wraparound vacuum. Normally the table should cross the
>> vacuum_table_age barrier earlier and promote a normal vacuum to a
>> full-table vacuum, which will set relfrozenxid to a new and lower value
>> and thus prevent anti-wraparound vacuums from occurring. So prioritizing
>> anti-wraparound vacuums immediately and heavily doesn't seem to be too
>> bad.
>
> IMHO, this is hopelessly optimistic.  Yes, it's intended to work that
> way.  But INSERT-only or INSERT-mostly tables are far from an uncommon
> use case; in fact they're probably the most common cause of pain in
> this area.  You insert a gajillion tuples, and vacuum never kicks off,
> and then eventually you either update some tuples or hit
> autovacuum_freeze_max_age and suddenly, BAM, you get this gigantic
> vacuum that rewrites the entire table.  And then you open a support
> ticket with your preferred PostgreSQL support provider and say
> something like "WTF?".

You're right, this doesn't work superbly well, especially for insert-only tables... But imo the place to fix it is not the prioritization logic but relation_needs_vacanalyze, since fixing it in prioritization won't prevent the BAM, just change its timing.

I think scheduling a table for a partial vacuum every min_freeze * 2 xids, even if it's insert-only, would go a long way toward reducing the impact of full-table vacuums. Obviously that would require retaining the last xid a vacuum was executed in...

Greetings,

Andres Freund

-- 
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] COPY FREEZE has no warning
On Sat, Feb 02, 2013 at 10:12:54AM -0500, Bruce Momjian wrote:
> On Sat, Feb 2, 2013 at 09:51:13AM -0500, Noah Misch wrote:
>> Let's touch on the exception in passing by using the phrase "last
>> truncated", giving this wording for both the second and the third COPY
>> FREEZE error sites:
>>
>> 	cannot perform FREEZE because the table was not created or last truncated in the current subtransaction
>
> Well, so you are saying that there really isn't any user-visible logic
> for those messages to be different, i.e. that the transaction id can be
> set to invalid even if we created/truncated in the same transaction, but
> not the same subtransaction?

Right. The latest committed code makes sense to me.
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On Sat, Feb 2, 2013 at 1:49 PM, Andres Freund and...@2ndquadrant.com wrote:
> You're right, this doesn't work superbly well, especially for
> insert-only tables... But imo the place to fix it is not the
> prioritization logic but relation_needs_vacanalyze, since fixing it in
> prioritization won't prevent the BAM, just change its timing.

Agreed.

> I think scheduling a table for a partial vacuum every min_freeze * 2
> xids, even if it's insert-only, would go a long way toward reducing the
> impact of full-table vacuums. Obviously that would require retaining
> the last xid a vacuum was executed in...

I'm not sure that min_freeze * 2 is the right value, but otherwise agreed. I keep coming back to the idea that vacuum should have a high-priority queue and a low-priority queue. When stuff meets the current thresholds, it goes into the high-priority queue. But then there should be a low-priority queue where we do partial vacuums of things that meet some lower threshold - like the unfrozen portions of insert-only tables.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] GetOldestXmin going backwards is dangerous after all
On 2 February 2013 18:38, Andres Freund and...@2ndquadrant.com wrote:
> On 2013-02-02 18:32:44 +0000, Simon Riggs wrote:
>> On 2 February 2013 14:24, Andres Freund and...@2ndquadrant.com wrote:
>>> b) We don't assign the xmin early enough; we only set it when the
>>> first feedback message arrives, but we should set it when walsender
>>> starts streaming.
>>
>> That's easy to fix.
>
> Not trivial, but not too hard, yes. When the standby initially connects
> we don't yet know which xid will be required, because consistency
> hasn't yet been achieved.

We don't need to change the protocol, we just need to issue a hot_standby_feedback message immediately after connection. That looks trivial to me.

>>> c) After a disconnect the feedback message will rather likely ask for
>>> an xmin horizon that's not valid anymore on the primary. If the
>>> disconnect was short enough, often enough that doesn't matter because
>>> nothing has been cleared out, but it doesn't really work all that
>>> well. That's still better than setting it to the currently valid
>>> minimal xmin horizon, because it prevents cleanup from that moment
>>> on. I don't see how this can be significantly improved without
>>> persistent knowledge about standbys.
>>
>> We could delay startup of the standby until the xmin on the standby
>> reaches the xmin on the master. ... That way the xmin of the walsender
>> never goes backwards, nor do we get cancellations on the standby.
>
> That's easy enough when the standby is initially started, but doesn't
> work that well if just the connection between both failed (or the
> master was restarted) and a reconnect worked. To make it work similarly
> in that case we would have to throw everyone out, which would kind of
> counteract the whole idea.

Agreed. It was worth considering, at least.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] proposal - assign result of query to psql variable
Hello

2013/2/2 Tom Lane t...@sss.pgh.pa.us:
> Shigeru Hanada shigeru.han...@gmail.com writes:
>> On Sat, Feb 2, 2013 at 7:30 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>>> possible variants
>>> a) don't store NULL values - and remove the existing variable when NULL
>>> is assigned - it is probably best, but might be confusing for users
>>> b) implement an IS NULL flag for variables
>>> c) use nullPrint
>>> d) use an empty string
>> +1 for a). If users want to determine whether the result was NULL, or
>> want to use a substitute string for a NULL result, they can use coalesce
>> in the SELECT clause. Anyway, the feature should be documented clearly.
> Yeah, I was considering that one too. Let's do it that way.

updated version

Regards

Pavel

> regards, tom lane

gset_20130202.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
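(For illustration, a usage sketch of the behaviour settled on above, using the \gset syntax from the attached patch as it was later committed; the prefix and column names here are hypothetical:)

-- store the one-row result into psql variables prefixed with "pref_"
SELECT 1 AS a, NULL AS b \gset pref_

\echo :pref_a
-- prints: 1

\echo :pref_b
-- pref_b was never set: per option a), a NULL result leaves the variable
-- unset, so psql prints the literal text :pref_b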
Re: [HACKERS] PL/Python result object str handler
On 13-01-07 09:58 PM, Peter Eisentraut wrote:
> By implementing a str handler for the result object, it now prints
> something like
>
> <PLyResult status=5 nrows=2 rows=[{'foo': 1, 'bar': '11'}, {'foo': 2, 'bar': '22'}]>
>
> Patch attached for review.

Here is a review:

This patch adds a handler that PL/Python functions can use to convert a query result into a string suitable for debugging purposes. The use case for this feature is primarily debugging and logging. I feel that this is useful, since a lot of debugging of stored functions is usually done with print/elog style debugging.

There was already some discussion on the thread as to whether the number of rows printed should be limited; the consensus seemed to be 'no', since someone would be unhappy with any limit, and printing everything is the same behaviour you get with the standard Python print.

I've tested this with Python 2.6 and 3.1 and it seems to work as described. I've looked through the code and everything looks fine.

The patch includes no documentation. Adding a few lines to the Utility Functions section of the plpython documentation so people know about this feature would be good. Other than that, I think it is fine to commit.

I am setting this as ready for committer. I assume you'll commit this yourself and that you can add a paragraph to the docs as you commit it.

Steve

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
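(For illustration, a minimal sketch of the kind of print/elog-style debugging this enables; the function name and query are hypothetical, and the NOTICE text follows the format described in the patch:)

CREATE FUNCTION debug_result_demo() RETURNS void
LANGUAGE plpythonu
AS $$
# run a query and log the whole result object in one line;
# with the str handler, str(rv) renders status, row count and rows
rv = plpy.execute("SELECT 1 AS foo, '11' AS bar")
plpy.notice(str(rv))
# emits something like:
# NOTICE:  <PLyResult status=5 nrows=1 rows=[{'foo': 1, 'bar': '11'}]>
$$;

SELECT debug_result_demo();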
Re: [HACKERS] Setting visibility map in VACUUM's second phase
On Sat, Jan 26, 2013 at 11:25 PM, Pavan Deolasee pavan.deola...@gmail.com wrote:
> On Thu, Jan 24, 2013 at 9:31 PM, Jeff Janes jeff.ja...@gmail.com wrote:
>> On Thu, Jan 24, 2013 at 1:28 AM, Pavan Deolasee pavan.deola...@gmail.com wrote:
>>> Good idea. Even though the cost of pinning/unpinning may not be high with
>>> respect to the vacuum cost itself, it seems to be a good idea because we
>>> already do that at other places. Do you have any other review comments on
>>> the patch, or shall I fix this and send an updated patch soon?
>> That was the only thing that stood out to me.
> The attached patch gets that improvement. Also rebased on the latest head.

Hi Pavan,

I get this warning:

vacuumlazy.c:890: warning: passing argument 6 of 'lazy_vacuum_page' makes pointer from integer without a cast

and make check then fails. I've added '&' to that line, and it now passes make check with --enable-cassert.

At line 1096, when you release the vmbuffer, you don't set it to InvalidBuffer like the other places in the code do. It seems like this would lead to a crash or assertion failure, but it does not seem to do so. Other places:

if (BufferIsValid(vmbuffer))
{
    ReleaseBuffer(vmbuffer);
    vmbuffer = InvalidBuffer;
}

Also, the "Note: If you change anything below, also look at" should probably say "Note: If you change anything in the for loop below, also look at". Otherwise I'd be wondering how far below the caveat applies.

I've attached a patch with these changes made. Does this look OK?

Thanks,

Jeff

vacuum-secondphase-setvm-v4.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On 2013-02-02 14:54:10 -0500, Robert Haas wrote:
> On Sat, Feb 2, 2013 at 1:49 PM, Andres Freund and...@2ndquadrant.com wrote:
>> I think scheduling a table for a partial vacuum every min_freeze * 2 xids,
>> even if it's insert-only, would go a long way toward reducing the impact
>> of full-table vacuums. Obviously that would require retaining the last xid
>> a vacuum was executed in...
> I'm not sure that min_freeze * 2 is the right value, but otherwise agreed.

Yes, min_freeze * 2 was purely a very first quick guess. My reasoning was basically that values smaller than that won't really do all that much work, since barely any of the tuples are old enough, and values much bigger won't help much in reducing the huge amounts of writes being submitted at the same time, and they might miss the window before the next anti-wraparound vacuum.

I guess something like 5 or so would theoretically be better, because it would reduce the number of tuples that are too young for freezing yet will be marked all-visible nonetheless and thus not scanned again. But it would not work with the current relevant default settings:

vacuum_freeze_table_age = 150,000,000
vacuum_freeze_min_age = 50,000,000
autovacuum_freeze_max_age = 200,000,000

I guess we will have to think about the defaults for those values. Ad hoc, I am thinking something like:

vacuum_freeze_table_age = 300,000,000
vacuum_freeze_min_age = 20,000,000
autovacuum_freeze_max_age = 800,000,000

and scheduling a vacuum, independent from n_dead_tuples, every (freeze_table_age - freeze_max_age) / 5 or so xids. That would mean that approx. 4/5 (or more on a busy system) of the tuples would get frozen before a full-table vacuum.

I don't think the disk size argument for freeze_max_age = 200,000,000 is particularly relevant anymore, especially as we would normally keep the size at vacuum_freeze_table_age.

To finally fix the issue, ISTM that we need an 'age map' to know which parts to scan again and which parts never need to be scanned again, but that's a separate, not too small, feature.

> I keep coming back to the idea that vacuum should have a high-priority
> queue and a low-priority queue. When stuff meets the current thresholds,
> it goes into the high-priority queue. But then there should be a
> low-priority queue where we do partial vacuums of things that meet some
> lower threshold - like the unfrozen portions of insert-only tables.

I don't think that's the most crucial part atm. Such a queue doesn't solve the problem that we don't want to do unnecessary, repetitive work.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
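(For context, the xid ages that these freeze settings are compared against can be inspected directly from the catalogs; a quick sketch:)

-- tables with the oldest relfrozenxid, i.e. closest to needing a
-- full-table (anti-wraparound) vacuum
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 10;

-- per-database horizon that autovacuum_freeze_max_age is checked against
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;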
Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state
On 2013-01-28 16:55:52 -0500, Steve Singer wrote:
> If you're using non-surrogate/natural primary keys, this tends to come up
> occasionally due to data-entry errors or renames. I'm looking at this from
> the point of view of what I need in order to use this as a source for a
> production replication system with fewer sharp edges than trigger-sourced
> Slony. My standard is a bit higher than a 'first' version, because I intend
> to use it in version 3.0 of Slony, not 1.0. If others feel I'm asking for
> too much, they should speak up; maybe I am.
>
> Also, the way things will fail if someone were to try to update a primary
> key value is pretty nasty (it will leave them with inconsistent databases).
> We could install UPDATE triggers to try to detect this type of thing, but
> I'd rather see us just log the old values so we can use them during replay.

I pushed support for this. I am not yet 100% happy with it due to a few issues:

* it increases the xlog size logged by heap_update by 2 bytes even with wal_level < logical, as it uses a variant of xl_heap_header that includes its length. Conditionally using xl_heap_header would make the code even harder to read. Is that acceptable?

* multi_insert should be converted to use xl_heap_header_len as well, instead of using xl_multi_insert_tuple; that would also reduce the amount of multi-insert-specific code in decode.c

* both for update and delete we should denote more explicitly that ->oldtuple points to an index tuple, not to a full tuple

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal - assign result of query to psql variable
Pavel Stehule pavel.steh...@gmail.com writes:
> 2013/2/2 Tom Lane t...@sss.pgh.pa.us:
>> Shigeru Hanada shigeru.han...@gmail.com writes:
>>> +1 for a). If users want to determine whether the result was NULL, or
>>> want to use a substitute string for a NULL result, they can use coalesce
>>> in the SELECT clause. Anyway, the feature should be documented clearly.
>> Yeah, I was considering that one too. Let's do it that way.
> updated version

Applied with corrections.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system
On Sat, Jan 5, 2013 at 8:03 PM, Tomas Vondra t...@fuzzy.cz wrote:
> On 3.1.2013 20:33, Magnus Hagander wrote:
>> Yeah, +1 for a separate directory not in global.
> OK, I moved the files from global/stat to stat.

This has a warning:

pgstat.c:5132: warning: 'pgstat_write_statsfile_needed' was used with no prototype before its definition

I plan to do some performance testing, but that will take a while, so I wanted to post this before I get distracted.

Cheers,

Jeff

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system
Ok, thanks for the info. I'll look into that, and I'll also post info from some of our production systems (we deployed a 9.1-backpatched version about 2 weeks ago).

T.

-------- Original message --------
From: Jeff Janes jeff.ja...@gmail.com
Date:
To: Tomas Vondra t...@fuzzy.cz
Cc: pgsql-hackers@postgresql.org
Subject: Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system

> On Sat, Jan 5, 2013 at 8:03 PM, Tomas Vondra t...@fuzzy.cz wrote:
>> On 3.1.2013 20:33, Magnus Hagander wrote:
>>> Yeah, +1 for a separate directory not in global.
>> OK, I moved the files from global/stat to stat.
>
> This has a warning:
>
> pgstat.c:5132: warning: 'pgstat_write_statsfile_needed' was used with no
> prototype before its definition
>
> I plan to do some performance testing, but that will take a while, so I
> wanted to post this before I get distracted.
>
> Cheers,
>
> Jeff
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On 2013-02-01 15:09:34 -0800, Jeff Janes wrote:
> As an aside, it does seem like log_autovacuum_min_duration=0 should log
> whether a scan_all was done, and if so what relfrozenxid got set to. But
> looking at where the log message is generated, I don't know where to
> retrieve that info.

What about the following, very rough and quick, patch:

LOG: automatic vacuum of table "postgres.public.data_1": index scans: 1
    pages: 2703 removed, 2702 remain, 5405 (100.00%) scanned
    tuples: 49 removed, 51 remain
    full-scan: 1, freeze-limit: 28824, new-frozen-xid: 28824
    buffer usage: 29957 hits, 2 misses, 4 dirtied
    avg read rate: 0.020 MB/s, avg write rate: 0.040 MB/s
    system usage: CPU 0.01s/0.67u sec elapsed 0.77 sec
...
LOG: automatic vacuum of table "postgres.public.data_1": index scans: 1
    pages: 2703 removed, 5404 remain, 5411 (66.74%) scanned
    tuples: 49 removed, 171 remain
    full-scan: 0, freeze-limit: 28828, new-frozen-xid: -
    buffer usage: 34085 hits, 3 misses, 4 dirtied
    avg read rate: 0.027 MB/s, avg write rate: 0.036 MB/s
    system usage: CPU 0.01s/0.73u sec elapsed 0.86 sec

It obviously needs more polish:

- I opted for using the 64-bit representation of xids; that seems better in a log which might well be looked at only after some wraparounds.
- Exporting 'txid' from adt/txid.c is pretty ugly. I don't like the invention of the type in general, and making it visible outside of txid.c is even uglier, but using both plain uint64 and txid inside txid.c isn't nice either.
- txid_from_xid should be renamed; I don't have a good idea to what right now.
- Is there agreement on the additionally logged information?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On 2013-02-03 02:40:04 +0100, Andres Freund wrote:
> On 2013-02-01 15:09:34 -0800, Jeff Janes wrote:
>> As an aside, it does seem like log_autovacuum_min_duration=0 should log
>> whether a scan_all was done, and if so what relfrozenxid got set to. But
>> looking at where the log message is generated, I don't know where to
>> retrieve that info.
> What about the following, very rough and quick, patch:

-EINTR

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 5ec65ea..881d8d6 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -304,6 +304,11 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 		TimestampDifferenceExceeds(starttime, endtime,
 								   Log_autovacuum_min_duration))
 	{
+		double		scanned_percent;
+		txid		freeze_txid;
+		txid		relfrozen_txid;
+		char		freeze_txid_buf[80];
+		char		relfrozen_txid_buf[80];
 		TimestampDifference(starttime, endtime, &secs, &usecs);
 
 		read_rate = 0;
@@ -315,10 +320,29 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 			write_rate = (double) BLCKSZ *VacuumPageDirty / (1024 * 1024) /
 						(secs + usecs / 1000000.0);
 		}
+
+		scanned_percent = ((double) vacrelstats->scanned_pages /
+						   (vacrelstats->rel_pages +
+							vacrelstats->pages_removed)) * 100;
+
+		freeze_txid = txid_from_xid(FreezeLimit);
+		relfrozen_txid = txid_from_xid(new_frozen_xid);
+
+		/*
+		 * print to buffer, so we don't have to include UINT64_FORMAT in a
+		 * translatable string.
+		 */
+		snprintf(freeze_txid_buf, sizeof(freeze_txid_buf),
+				 UINT64_FORMAT, freeze_txid);
+
+		snprintf(relfrozen_txid_buf, sizeof(relfrozen_txid_buf),
+				 UINT64_FORMAT, relfrozen_txid);
+
 		ereport(LOG,
 				(errmsg("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"
-						"pages: %d removed, %d remain\n"
+						"pages: %d removed, %d remain, %d (%.02f%%) scanned\n"
 						"tuples: %.0f removed, %.0f remain\n"
+						"full-scan: %d, freeze-limit: %s, new-frozen-xid: %s\n"
 						"buffer usage: %d hits, %d misses, %d dirtied\n"
 						"avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"
 						"system usage: %s",
@@ -328,8 +352,13 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 						vacrelstats->num_index_scans,
 						vacrelstats->pages_removed,
 						vacrelstats->rel_pages,
+						vacrelstats->scanned_pages,
+						scanned_percent,
 						vacrelstats->tuples_deleted,
 						vacrelstats->new_rel_tuples,
+						scan_all,
+						freeze_txid_buf,
+						scan_all ? relfrozen_txid_buf : "-",
 						VacuumPageHit,
 						VacuumPageMiss,
 						VacuumPageDirty,
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index de86756..78edec5 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -33,9 +33,6 @@
 /* txid will be signed int8 in database, so must limit to 63 bits */
 #define MAX_TXID		UINT64CONST(0x7FFFFFFFFFFFFFFF)
 
-/* Use unsigned variant internally */
-typedef uint64 txid;
-
 /* sprintf format code for uint64 */
 #define TXID_FMT		UINT64_FORMAT
 
@@ -98,11 +95,17 @@ convert_xid(TransactionId xid, const TxidEpoch *state)
 	if (!TransactionIdIsNormal(xid))
 		return (txid) xid;
 
-	/* xid can be on either side when near wrap-around */
 	epoch = (uint64) state->epoch;
+	/* xid is too old to be real */
 	if (xid > state->last_xid &&
+		TransactionIdPrecedes(xid, state->last_xid) &&
+		epoch == 0)
+		xid = InvalidTransactionId;
+	/* xid in last epoch */
+	else if (xid > state->last_xid &&
 		TransactionIdPrecedes(xid, state->last_xid))
 		epoch--;
+	/* xid after wraparound */
 	else if (xid < state->last_xid &&
 			 TransactionIdFollows(xid, state->last_xid))
 		epoch++;
@@ -317,6 +320,16 @@
  * communicate with core xid machinery.  All the others work on data
  * returned by them.
  */
+txid
+txid_from_xid(TransactionId xid)
+{
+	txid		val;
+	TxidEpoch	state;
+
+	load_xid_epoch(&state);
+	val = convert_xid(xid, &state);
+	return val;
+}
 
 /*
  * txid_current() returns int8
@@ -327,9 +340,6 @@ bad_format:
 Datum
 txid_current(PG_FUNCTION_ARGS)
 {
-	txid		val;
-	TxidEpoch	state;
-
 	/*
 	 * Must prevent during recovery because if an xid is not assigned we try
 	 * to assign one, which would fail. Programs already rely on this function
@@ -338,11 +348,7 @@ txid_current(PG_FUNCTION_ARGS)
 	 */
 	PreventCommandDuringRecovery("txid_current()");
 
-	load_xid_epoch(&state);
-
-	val = convert_xid(GetTopTransactionId(), &state);
-
-	PG_RETURN_INT64(val);
+	return txid_from_xid(GetTopTransactionId());
 }
 
 /*
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 23a41fd..5d99f4d 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -127,6 +127,13 @@ typedef struct VariableCacheData
 
 typedef VariableCacheData *VariableCache;
 
+/*
+ * 64bit representation of xids
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On Sat, Feb 2, 2013 at 2:54 PM, Robert Haas robertmh...@gmail.com wrote:
> On Sat, Feb 2, 2013 at 1:49 PM, Andres Freund and...@2ndquadrant.com wrote:
>> You're right, this doesn't work superbly well, especially for insert-only
>> tables... But IMO the place to fix it is not the prioritization logic but
>> relation_needs_vacanalyze, since fixing it in prioritization won't prevent
>> the BAM, just the timing of it.
> Agreed.
>> I think scheduling a table for a partial vacuum every min_freeze * 2 xids,
>> even if it's insert-only, would go a long way toward reducing the impact
>> of full-table vacuums. Obviously that would require retaining the last xid
>> a vacuum was executed in...
> I'm not sure that min_freeze * 2 is the right value, but otherwise agreed.
>
> I keep coming back to the idea that vacuum should have a high-priority
> queue and a low-priority queue. When stuff meets the current thresholds,
> it goes into the high-priority queue. But then there should be a
> low-priority queue where we do partial vacuums of things that meet some
> lower threshold - like the unfrozen portions of insert-only tables.

When I was thinking about your desire for unitless values, I found myself uncomfortable about that, and I think I've mentioned that. On further reflection, there's good reason: the need to vacuum tables with lots of dead tuples has very different characteristics from the need to vacuum tables to avoid XID rollover. Trying to force them onto the same units seems unlikely to turn out happily.

On the other hand, I always thought that there was a use for multiple autovacuum queues, and giving the queues differently shaped policies, one for each purpose, seems like a mighty fine idea. That way we don't need to worry about mixing the policies; there can be two "best" policies.

I'd go further, and have 3 queues:

a) A queue devoted to vacuuming small tables. Anything with more than [some number of relpages] need not apply.
b) A queue devoted to vacuuming tables with a lot of dead tuples.
c) A queue devoted to vacuuming tables before their XID rollover.

The appropriate strength functions for b) and c) can be pretty simple, possibly the relevant bits of the functions that Nasby and I have suggested. And any time b) and c) find small tables, throw them to queue a), essentially doing the quick, easy vacuums there.

--
When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
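(For illustration only, a rough sketch of how the proposed three-queue classification above might be computed from the existing catalogs; the relpages and age cutoffs below are made up, chosen just to show the shape of such a classification:)

-- hypothetical assignment of plain tables to the three proposed queues
SELECT c.relname,
       CASE
           WHEN c.relpages < 1280 THEN 'a: small table'          -- ~10 MB, made-up cutoff
           WHEN age(c.relfrozenxid) > 150000000 THEN 'c: xid rollover'
           ELSE 'b: dead tuples'
       END AS queue,
       s.n_dead_tup,
       age(c.relfrozenxid) AS xid_age
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relkind = 'r'
ORDER BY queue, xid_age DESC;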
Re: [HACKERS] autovacuum not prioritising for-wraparound tables
On 03/02/13 15:08, Christopher Browne wrote:
> [...]
> I'd go further, and have 3 queues:
>
> a) A queue devoted to vacuuming small tables. Anything with more than
> [some number of relpages] need not apply.
> b) A queue devoted to vacuuming tables with a lot of dead tuples.
> c) A queue devoted to vacuuming tables before their XID rollover.
>
> The appropriate strength functions for b) and c) can be pretty simple,
> possibly the relevant bits of the functions that Nasby and I have
> suggested. And any time b) and c) find small tables, throw them to queue
> a), essentially doing the quick, easy vacuums there.

Hmm... Could there be some measure of bloatedness? A table with 10 live rows and 100 dead tuples should surely have a higher priority for vacuuming than a table with 1000 live rows and 100 dead tuples - especially compared to tables with hundreds of millions of rows!

Cheers,
Gavin
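(For illustration, one simple way to express such a "bloatedness" measure with existing statistics is a dead-to-live ratio, so that a small, heavily bloated table ranks above a big, mildly bloated one; a sketch:)

-- rank tables by the fraction of dead tuples rather than the absolute count
SELECT relname,
       n_live_tup,
       n_dead_tup,
       n_dead_tup::float8 / GREATEST(n_live_tup + n_dead_tup, 1) AS dead_fraction
FROM pg_stat_user_tables
ORDER BY dead_fraction DESC;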
Re: [HACKERS] [PATCH] pg_isready (was: [WIP] pg_ping utility)
On Tue, Jan 29, 2013 at 11:43 AM, Fujii Masao masao.fu...@gmail.com wrote:
> On Tue, Jan 29, 2013 at 3:12 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
>> Phil Sorber wrote:
>>> On Mon, Jan 28, 2013 at 11:20 AM, Fujii Masao masao.fu...@gmail.com wrote:
>>>> Maybe. But I'm not inclined to add a new libpq interface at this stage,
>>>> because we are in the last CommitFest and I'm not sure whether we have
>>>> enough time to implement that. Instead, how about using both
>>>> PQconninfoParse() and PQconndefaults() like uri-regress.c does? Or just
>>>> remove that output message? At least I don't think that the information
>>>> about host and port needs to be output, which would make the code much
>>>> simpler.
>>> I think that output is important, as do others upthread.
>> I think it'd be simpler to just disable dbname's ability to double as a
>> conninfo. If we are looking for easy, that is.
>>> I don't mind duplicating the behavior from conninfo_array_parse() or
>>> uri-regress.c either. I can just put in a comment that at some point in
>>> the future we should add a libpq interface for it.
>> I suggest duplicating the code for 9.3, and submitting a patch to refactor
>> it into a new libpq function for CF2013-next. If the patch is simple
>> enough, we can consider putting it into 9.3.
> Agreed.
>
> Regards,
>
> -- Fujii Masao

OK, here is the patch that handles the connection string in dbname. I'll post the other patch under a different posting, because I am sure it will get plenty of debate on its own.

pg_isready_con_str.diff
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq
This patch came up from discussion about pg_isready.

PQconninfoParseParams is similar to PQconninfoParse, but takes two arrays like PQconnectdbParams. It essentially exposes conninfo_array_parse().

PQconninfodefaultsMerge essentially exposes conninfo_add_defaults(). It allows you to pass a PQconninfoOption struct, and it adds defaults for all NULL values.

There are no docs yet. I assumed I would let bikeshedding ensue, and also debate on whether we even want these, first.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python result object str handler
On Sat, 2013-02-02 at 15:43 -0500, Steve Singer wrote:
> I've looked through the code and everything looks fine. The patch
> includes no documentation. Adding a few lines to the Utility Functions
> section of the plpython documentation so people know about this feature
> would be good.

Added some documentation and committed. Thanks.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting visibility map in VACUUM's second phase
On Sun, Feb 3, 2013 at 2:31 AM, Jeff Janes jeff.ja...@gmail.com wrote:
> Hi Pavan,
>
> I get this warning:
>
> vacuumlazy.c:890: warning: passing argument 6 of 'lazy_vacuum_page' makes
> pointer from integer without a cast
>
> and make check then fails. I've added '&' to that line, and it now passes
> make check with --enable-cassert.

Stupid me. Obviously I did not run make check before submitting the patch, but I'm surprised my short pgbench test did not catch this. Thanks a lot for finding and fixing it.

> At line 1096, when you release the vmbuffer, you don't set it to
> InvalidBuffer like the other places in the code do. It seems like this
> would lead to a crash or assertion failure, but it does not seem to do so.

That's harmless, because vmbuffer is just a local variable in that function, we are at the end of the function, and the variable is not used again. But it helps to be consistent, so I'm OK with your change.

> Also, the "Note: If you change anything below, also look at" should
> probably say "Note: If you change anything in the for loop below, also
> look at". Otherwise I'd be wondering how far below the caveat applies.

Ok.

> I've attached a patch with these changes made. Does this look OK?

Looks good to me. I also repeated pgbench and make check, and they work as expected. I'll add it to the CF and also mark the patch ready for committer.

Thanks,

Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq
On Feb 3, 2013 4:16 AM, Phil Sorber p...@omniti.com wrote:
> This patch came up from discussion about pg_isready.
>
> PQconninfoParseParams is similar to PQconninfoParse, but takes two arrays
> like PQconnectdbParams. It essentially exposes conninfo_array_parse().
>
> PQconninfodefaultsMerge essentially exposes conninfo_add_defaults(). It
> allows you to pass a PQconninfoOption struct, and it adds defaults for all
> NULL values.
>
> There are no docs yet. I assumed I would let bikeshedding ensue, and also
> debate on whether we even want these, first.

I think you forgot to attach the patch.

/Magnus
Re: [HACKERS] Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
On Saturday, February 02, 2013 9:08 PM, Robert Haas wrote:
> On Fri, Feb 1, 2013 at 12:04 AM, Amit Kapila amit.kap...@huawei.com wrote:
>> I think the user should be aware of the effect before using SET commands,
>> as these are used at various levels (TRANSACTION, SESSION, ...).
> Ideally, sure. But these kinds of mistakes are easy to make.

You mean to say that after using SET TRANSACTION, the user may think the following statements will use the modified transaction property. But I think if the user doesn't understand that, by default, a transaction is committed after each statement's execution, he might expect that after a few statements he can still roll back.

> That's why LOCK and DECLARE CURSOR already emit errors in this case - why
> should this one be any different?

IMO an error should be given when it is not possible to execute the current statement. Not only LOCK and DECLARE CURSOR but SAVEPOINT and some other statements also give the same error, so if we want to throw an error for such behavior, we can find all similar statements (SET TRANSACTION, SET LOCAL, etc.) and do it for all of them. This can be helpful to some users, but I'm not sure whether such behavior (the statement can be executed but it will not have any effect) is always detectable and maintainable.

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
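(For illustration, a minimal sketch of the pitfall under discussion: outside a transaction block each statement runs, and commits, in its own implicit transaction, so a bare SET TRANSACTION is silently useless:)

-- silently without lasting effect: the implicit transaction ends with the statement
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT 1;   -- already runs in a new transaction at the default isolation level

-- intended usage, inside an explicit transaction block
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT 1;   -- now runs at SERIALIZABLE
COMMIT;

-- LOCK, DECLARE CURSOR and SAVEPOINT already error out in the first case,
-- e.g. ERROR:  LOCK TABLE can only be used in transaction blocks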
Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq
On Sun, Feb 3, 2013 at 1:37 AM, Magnus Hagander mag...@hagander.net wrote:
> On Feb 3, 2013 4:16 AM, Phil Sorber p...@omniti.com wrote:
>> This patch came up from discussion about pg_isready.
>> [...]
>> There are no docs yet. I assumed I would let bikeshedding ensue, and also
>> debate on whether we even want these, first.
> I think you forgot to attach the patch.
>
> /Magnus

/me hangs head in shame :-(

Here it is...

libpq_params_parse_merge.diff
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers