Re: [GENERAL] Lots of stuck queries after upgrade to 9.4
On 07/30/2015 04:23 PM, Spiros Ioannou wrote:
> I'm very sorry but we don't have a synthetic load generator for our
> testing setup, only production and that is on SLA. I would be happy to
> test the next release though.

Ok, no worries. I've pushed these changes; they will appear in the next release. Thanks for the report!

- Heikki

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Lots of stuck queries after upgrade to 9.4
On 07/28/2015 11:36 PM, Heikki Linnakangas wrote:
> A-ha, I succeeded to reproduce this now on my laptop, with pgbench! It
> seems to be important to have a very large number of connections:
>
>   pgbench -n -c400 -j4 -T600 -P5
>
> That got stuck after a few minutes. I'm using commit_delay=100. Now
> that I have something to work with, I'll investigate this more
> tomorrow.

Ok, it seems that this is caused by the same issue that I found with my synthetic test case, after all. It is possible to get a lockup because of it.

For the archives, here's a hopefully easier-to-understand explanation of how the lockup happens. It involves three backends. A and C are inserting WAL records, while B is flushing the WAL with commit_delay. The byte positions 2000, 2100, 2200, and 2300 are offsets within a WAL page; 2000 points to the beginning of the page, while the others are later positions on the same page. WaitToFinish() is an abbreviation for WaitXLogInsertionsToFinish(). "Update pos X" means a call to WALInsertLockUpdateInsertingAt(X). "Reserve A-B" means a call to ReserveXLogInsertLocation, which returned StartPos A and EndPos B.

A: Acquire InsertLock 2
A: Reserve 2100-2200
B: Calls WaitToFinish(); reservedUpto is 2200; sees that Lock 1 is free
C: Acquire InsertLock 1
C: Reserve 2200-2300
C: GetXLogBuffer(2200): page not in cache; Update pos 2000;
   AdvanceXLInsertBuffer(); runs until about to acquire WALWriteLock
A: GetXLogBuffer(2100): page not in cache; Update pos 2000;
   AdvanceXLInsertBuffer(); Acquire WALWriteLock; write out old page;
   initialize new page; Release WALWriteLock
A: finishes insertion; releases InsertLock 2
B: WaitToFinish() continues; sees that Lock 2 is free; returns 2200
B: Acquire WALWriteLock; call WaitToFinish(2200); blocks on Lock 1,
   whose initializedUpto is 2000

At this point, there is a deadlock between B and C. B is waiting for C to release the lock or update its insertingAt value past 2200, while C is waiting for WALWriteLock, held by B.
To fix that, let's fix GetXLogBuffer() to always advertise the exact position, not the beginning of the page (except when inserting the first record on the page, just after the page header; see comments). This fixes the problem for me: I've been running pgbench for about 30 minutes without lockups now, while without the patch it locked up within a couple of minutes.

Spiros, can you easily test this patch in your environment? It would be nice to get a confirmation that this fixes the problem for you too.

- Heikki

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8e9754c..307a04c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1839,11 +1839,32 @@ GetXLogBuffer(XLogRecPtr ptr)
 	endptr = XLogCtl->xlblocks[idx];
 	if (expectedEndPtr != endptr)
 	{
+		XLogRecPtr	initializedUpto;
+
 		/*
-		 * Let others know that we're finished inserting the record up to the
-		 * page boundary.
+		 * Before calling AdvanceXLInsertBuffer(), which can block, let others
+		 * know how far we're finished with inserting the record.
+		 *
+		 * NB: If 'ptr' points to just after the page header, advertise a
+		 * position at the beginning of the page rather than 'ptr' itself. If
+		 * there are no other insertions running, someone might try to flush
+		 * up to our advertised location. If we advertised a position after
+		 * the page header, someone might try to flush the page header, even
+		 * though page might actually not be initialized yet. As the first
+		 * inserter on the page, we are effectively responsible for making
+		 * sure that it's initialized, before we let insertingAt to move past
+		 * the page header.
		 */
-		WALInsertLockUpdateInsertingAt(expectedEndPtr - XLOG_BLCKSZ);
+		if (ptr % XLOG_BLCKSZ == SizeOfXLogShortPHD &&
+			ptr % XLOG_SEG_SIZE > XLOG_BLCKSZ)
+			initializedUpto = ptr - SizeOfXLogShortPHD;
+		else if (ptr % XLOG_BLCKSZ == SizeOfXLogLongPHD &&
+				 ptr % XLOG_SEG_SIZE < XLOG_BLCKSZ)
+			initializedUpto = ptr - SizeOfXLogLongPHD;
+		else
+			initializedUpto = ptr;
+
+		WALInsertLockUpdateInsertingAt(initializedUpto);
Re: [GENERAL] Lots of stuck queries after upgrade to 9.4
A-ha, I succeeded to reproduce this now on my laptop, with pgbench! It seems to be important to have a very large number of connections:

  pgbench -n -c400 -j4 -T600 -P5

That got stuck after a few minutes. I'm using commit_delay=100. Now that I have something to work with, I'll investigate this more tomorrow.

- Heikki
Re: [GENERAL] Lots of stuck queries after upgrade to 9.4
On 07/23/2015 02:36 PM, Heikki Linnakangas wrote:
> On 07/23/2015 11:31 AM, Spiros Ioannou wrote:
>> Well, so far with commit_delay=0 no problems. I will report back of
>> course if something happens, but I believe that the problem may indeed
>> be solved/masked with that setting.
>>
>> Rough description of our setup, or how to reproduce:
>> * Timeseries data in table, say, "measurements", size: 3-4TB, about
>>   1000 inserts/second
>> * table measurements also has a trigger on insert to also insert on
>>   measurements_a (for daily export purposes)
>>
>> Just the above would cause a stuck query after a few days. Now for
>> exporting we run the following CTE query (measurements_b is an empty
>> table, measurements_a has about 5GB):
>>
>> WITH d_rows AS (DELETE FROM measurement_events_a RETURNING *)
>> INSERT INTO measurement_events_b SELECT * FROM d_rows;
>>
>> The above caused the problem to appear every time, after 10-20 minutes.
>
> Hmm. With that CTE query, were there other queries running at the same
> time? I was able to reproduce something like this with pgbench, by
> running a custom little module that calls the
> WaitXLogInsertionsToFinish() in a tight loop, and checks that the value
> it returns moves monotonically forward. With commit_delay on, once
> every minute or so, it moves backwards. I'll investigate why that
> happens...

I was able to debug the synthetic test case I created, but unfortunately I don't think it explains the lockup you're seeing after all. It's possible for WaitXLogInsertionsToFinish() to move backwards, in this scenario:

1. Backend A acquires WALInsertLock 2, and reserves xlog between byte positions 2100 - 2200
2. Backend B calls WaitXLogInsertionsToFinish(), which blocks on backend A, which hasn't advertised any location yet.
3. Backend C acquires WALInsertLock 1, and reserves xlog between byte positions 2200 - 2300
4. Backend C calls GetXLogBuffer(), and sees that the page is not in cache yet. (It does not call WALInsertLockUpdateInsertingAt() yet, because it's a bit slow or context-switched out)
5. Backend A initializes the page, completes inserting its WAL record, and releases its WALInsertLock.
6. Backend B gets unblocked, seeing that the lock held by A is now free. It calculated 2200 as the return value, which was the latest reserved WAL position. (Backend C started after it began, so it didn't have to wait for it)
7. Backend C calls WALInsertLockUpdateInsertingAt(), with a WAL position pointing to the beginning of the page, 2000.

If you now call WaitXLogInsertionsToFinish() again, it will return 2000, because backend C is the only backend holding a lock, and its advertised position is 2000. But the previous call calculated 2200. GetXLogBuffer() always advertises a WAL position at the beginning of the requested page, but that's a bit bogus.

However, AFAICS that is actually harmless. Backend C is not blocked. The page it's looking for is certainly in cache at this point, so it can continue without blocking. So I don't think this explains your lockup.

- Heikki
Re: [GENERAL] Lots of stuck queries after upgrade to 9.4
On 07/23/2015 11:31 AM, Spiros Ioannou wrote:
> Well, so far with commit_delay=0 no problems. I will report back of
> course if something happens, but I believe that the problem may indeed
> be solved/masked with that setting.
>
> Rough description of our setup, or how to reproduce:
> * Timeseries data in table, say, "measurements", size: 3-4TB, about
>   1000 inserts/second
> * table measurements also has a trigger on insert to also insert on
>   measurements_a (for daily export purposes)
>
> Just the above would cause a stuck query after a few days. Now for
> exporting we run the following CTE query (measurements_b is an empty
> table, measurements_a has about 5GB):
>
> WITH d_rows AS (DELETE FROM measurement_events_a RETURNING *)
> INSERT INTO measurement_events_b SELECT * FROM d_rows;
>
> The above caused the problem to appear every time, after 10-20 minutes.

I was able to reproduce something like this with pgbench, by running a custom little module that calls the WaitXLogInsertionsToFinish() in a tight loop, and checks that the value it returns moves monotonically forward. With commit_delay on, once every minute or so, it moves backwards. I'll investigate why that happens...

- Heikki
Re: [GENERAL] Lots of stuck queries after upgrade to 9.4
On 07/20/2015 03:01 PM, Andres Freund wrote:
> Heikki,
>
> On 2015-07-20 13:27:12 +0200, Andres Freund wrote:
> On 2015-07-20 13:22:42 +0200, Andres Freund wrote:
>
> Hm. The problem seems to be the WaitXLogInsertionsToFinish() call in
> XLogFlush(). These are the relevant stack traces:
>
> db9lock/debuglog-commit.txt
> #2  0x7f7405bd44f4 in LWLockWaitForVar (l=0x7f70f2ab6680,
>     valptr=0x7f70f2ab66a0, oldval=, newval=0x)
>     at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:1011
> #3  0x7f7405a0d3e6 in WaitXLogInsertionsToFinish (upto=121713318915952)
>     at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1755
> #4  0x7f7405a0e1d3 in XLogFlush (record=121713318911056)
>     at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:2849
>
> db9lock/debuglog-insert-8276.txt
> #1  0x7f7405b77d91 in PGSemaphoreLock (sema=0x7f73ff6531d0, interruptOK=0 '\000')
>     at pg_sema.c:421
> #2  0x7f7405bd4849 in LWLockAcquireCommon (val=, valptr=, mode=, l=)
>     at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:626
> #3  LWLockAcquire (l=0x7f70ecaaa1a0, mode=LW_EXCLUSIVE)
>     at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:467
> #4  0x7f7405a0dcca in AdvanceXLInsertBuffer (upto=, opportunistic=)
>     at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:2161
> #5  0x7f7405a0e301 in GetXLogBuffer (ptr=121713318928384)
>     at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1848
> #6  0x7f7405a0e9c9 in CopyXLogRecordToWAL (EndPos=, StartPos=,
>     rdata=0x71c21b90, isLogSwitch=, write_len=)
>     at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1494
> #7  XLogInsert (rmid=, info=, rdata=)
>     at /tmp/buildd/postgre
>
> XLogFlush() has the following comment:
>
>     /*
>      * Re-check how far we can now flush the WAL.
>      * It's generally not
>      * safe to call WaitXLogInsertionsToFinish while holding
>      * WALWriteLock, because an in-progress insertion might need to
>      * also grab WALWriteLock to make progress. But we know that all
>      * the insertions up to insertpos have already finished, because
>      * that's what the earlier WaitXLogInsertionsToFinish() returned.
>      * We're only calling it again to allow insertpos to be moved
>      * further forward, not to actually wait for anyone.
>      */
>     insertpos = WaitXLogInsertionsToFinish(insertpos);
>
> but I don't think that's valid reasoning. WaitXLogInsertionsToFinish()
> calls LWLockWaitForVar(oldval = InvalidXLogRecPtr), which will block if
> there's an exclusive locker and some backend doesn't yet have set
> initializedUpto. Which seems like a possible state?

A backend always updates its insert position before sleeping/acquiring another lock, by calling WALInsertLockUpdateInsertingAt. So even though another backend might indeed be in the initializedUpto==InvalidXlogRecPtr state, it will get out of that state, either by releasing the lock or by updating initializedUpto, before it does anything that might deadlock.

Clearly there's *something* wrong here, though, given the bug report...

- Heikki
[GENERAL] Re: [HACKERS] Check that streaming replica received all data after master shutdown
On 01/13/2015 12:11 PM, Vladimir Borodin wrote:
> On 05 Jan 2015, at 18:15, Vladimir Borodin wrote:
>> Hi all.
>>
>> I have a simple script for planned switchover of a PostgreSQL (9.3 and
>> 9.4) master to one of its replicas. This script checks a lot of things
>> before doing it, and one of them is that all data from the master has
>> been received by the replica that is going to be promoted. Right now
>> the check is done like below:
>>
>> On the master:
>>
>> postgres@pgtest03d ~ $ psql -t -A -c 'select pg_current_xlog_location();'
>> 0/3390
>> postgres@pgtest03d ~ $ /usr/pgsql-9.3/bin/pg_ctl stop -m fast
>> waiting for server to shut down done
>> server stopped
>> postgres@pgtest03d ~ $ /usr/pgsql-9.3/bin/pg_controldata | head
>> pg_control version number:            937
>> Catalog version number:               201306121
>> Database system identifier:           6061800518091528182
>> Database cluster state:               shut down
>> pg_control last modified:             Mon 05 Jan 2015 06:47:57 PM MSK
>> Latest checkpoint location:           0/3428
>> Prior checkpoint location:            0/3328
>> Latest checkpoint's REDO location:    0/3428
>> Latest checkpoint's REDO WAL file:    001B0034
>> Latest checkpoint's TimeLineID:       27
>> postgres@pgtest03d ~ $
>>
>> On the replica (after shutdown of master):
>>
>> postgres@pgtest03g ~ $ psql -t -A -c "select pg_xlog_location_diff(pg_last_xlog_replay_location(), '0/3428');"
>> 104
>> postgres@pgtest03g ~ $
>>
>> These 104 bytes seem to be the size of the shutdown checkpoint record
>> (as I can understand from pg_xlogdump output).
>> postgres@pgtest03g ~/9.3/data/pg_xlog $ /usr/pgsql-9.3/bin/pg_xlogdump -s 0/3390 -t 27
>> rmgr: XLOG  len (rec/tot): 0/32,   tx: 0, lsn: 0/3390, prev 0/3328, bkp: , desc: xlog switch
>> rmgr: XLOG  len (rec/tot): 72/104, tx: 0, lsn: 0/3428, prev 0/3390, bkp: , desc: checkpoint: redo 0/3428; tli 27; prev tli 27; fpw true; xid 0/6010; oid 54128; multi 1; offset 0; oldest xid 1799 in DB 1; oldest multi 1 in DB 1; oldest running xid 0; shutdown
>> pg_xlogdump: FATAL: error in WAL record at 0/3428: record with zero length at 0/3490
>> postgres@pgtest03g ~/9.3/data/pg_xlog $
>>
>> I'm not sure that it will always be 104 bytes, to have a strict
>> equality while checking. Could it change in the future? Or is there a
>> better way to understand that a streaming replica received all data
>> after master shutdown? The check that pg_xlog_location_diff returns
>> 104 bytes seems a bit strange.

Don't rely on it being 104 bytes. It can vary across versions, and across different architectures. You could simply check that the standby's pg_last_xlog_replay_location() > master's "Latest checkpoint location", and not care about the exact difference.

- Heikki
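For example, the suggested comparison could be scripted on the standby like this; the LSN literal is whatever pg_controldata reported as "Latest checkpoint location" on the master after its clean shutdown ('0/3428' is taken from the output above):

```sql
-- Run on the standby, after the master has been shut down cleanly.
-- True when the standby has replayed past the master's final checkpoint:
SELECT pg_xlog_location_diff(pg_last_xlog_replay_location(),
                             '0/3428') > 0 AS caught_up;
```

pg_xlog_location_diff() does the LSN arithmetic, so the script doesn't need to parse the X/Y format itself.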
Re: [GENERAL] freeze cannot be finished
On 14.11.2013 02:26, Jeff Janes wrote:
> On Wed, Nov 13, 2013 at 3:53 PM, Sergey Burladyan wrote:
>> Jeff Janes writes:
>>
>> If I am not mistaken, it looks like lazy_scan_heap(), called from
>> lazy_vacuum_rel() (see [1]), skips pages even if it runs with
>> scan_all == true. lazy_scan_heap() does not increment scanned_pages
>> if lazy_check_needs_freeze() returns false, so if this occurs at
>> wraparound vacuum it cannot update pg_class, because pg_class is
>> updated via this code:
>>
>>     new_frozen_xid = FreezeLimit;
>>     if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
>>         new_frozen_xid = InvalidTransactionId;
>>     vac_update_relstats(onerel, new_rel_pages, new_rel_tuples,
>>                         new_rel_allvisible, vacrelstats->hasindex,
>>                         new_frozen_xid);
>>
>> so I think in our prevent-wraparound vacuum
>> vacrelstats->scanned_pages is always less than
>> vacrelstats->rel_pages and pg_class relfrozenxid is never updated.
>
> Yeah, I think that that is a bug. If the clean-up lock is unavailable,
> but the page is inspected without it and found not to need freezing,
> then the page needs to be counted as scanned, but is not so counted.
>
> commit bbb6e559c4ea0fb4c346beda76736451dc24eb4e
> Date: Mon Nov 7 21:39:40 2011 -0500
>
> But this was introduced in 9.2.0, so unless the OP didn't upgrade to
> 9.2 until recently, I don't know why it just started happening.
>
> It looks like a simple fix (to HEAD attached), but I don't know how to
> test it.

Thanks, committed. I was able to reproduce it by doing this:

-- Create and populate a test table
create table foo (i int4);
insert into foo select generate_series(1, 1);

-- Freeze it, and observe relfrozenxid.
vacuum freeze foo;
select relfrozenxid from pg_class where oid='foo'::regclass;

-- Do some transactions to bump current transaction ID
insert into foo values (-1);
insert into foo values (-1);
insert into foo values (-1);
insert into foo values (-1);

-- Now, in a second psql session, open a cursor on the table. It keeps
-- the current page pinned, which causes the ConditionalLockBuffer()
-- in vacuumlazy.c to fail to grab the lock:
begin;
declare foocur cursor for select * from foo;
fetch foocur;

-- Back to the original psql session, vacuum freeze again:
vacuum freeze foo;

-- Observe the new relfrozenxid. With the bug, it's the same as before:
-- vacuum freeze is not able to advance relfrozenxid, because it skipped
-- the page pinned by the cursor. With the patch, it does advance.
select relfrozenxid from pg_class where oid='foo'::regclass;

> Also, it seems like it might be worth issuing a warning if scan_all is
> true but all was not scanned.

Hmm, the new definition of "scanned", with this patch, is that pages that were not vacuumed are still counted as scanned. I don't think a warning is appropriate; there isn't anything wrong with skipping pinned pages that don't need freezing, and the amount of bloat left behind should be tiny. For diagnostic purposes, perhaps they should be counted separately in vacuum verbose output, though.

- Heikki
Re: [pgeu-general] [GENERAL] Call for design: PostgreSQL mugs
Put a small elephant logo *inside* the mug. With a text: "also for embedded systems"

- Heikki
Re: [GENERAL] [pgeu-general] Replication failover
On 23.05.2013 03:55, TJ wrote:
> We have a few different sets of servers with different versions: 9.0.4,
> 9.1.4, 9.2.3. I recently tried to fail over a set of 9.2.3 servers and
> server4 did notice the timeline change but did not start following it.
> We do not have recovery_target_timeline set in the recovery.conf. ATM
> we are not using WAL archiving, only streaming. And server4 is behind
> server3. So would we need to set up WAL archiving to achieve this?

Yep. Set up a WAL archive, set recovery_target_timeline='latest', and use 9.1 or above, and it should work.

- Heikki
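Concretely, the standby's recovery.conf would then look something like this (the connection string and archive path are placeholders for this environment, not values from the thread):

```
standby_mode = 'on'
primary_conninfo = 'host=server3.example.com port=5432 user=replication'
restore_command = 'cp /mnt/wal_archive/%f %p'
recovery_target_timeline = 'latest'
```

restore_command must point at the same archive that the primary's archive_command writes to; that archive is how the standby learns about the new timeline after failover on pre-9.3 versions.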
Re: [GENERAL] [pgeu-general] Replication failover
BTW, pgeu-general is not for technical questions, so moving to pgsql-general. (I didn't notice the mailing list this came from until after replying.)

On 22.05.2013 18:22, Heikki Linnakangas wrote:
> On 22.05.2013 10:23, TJ wrote:
>> I am looking to migrate my databases from one set of hardware to
>> another. All the servers are running PGSQL 9.x.
>
> Which version exactly? There are differences in this area between 9.0,
> 9.1, and 9.2. If you can choose, use the latest 9.2.X version.
>
>> The problem I have is that when I fail over to server3 using the
>> trigger file it will increment the timeline, which will stop the
>> replication of server4.
>
> With 9.1 and above, it will notice the new timeline and start following
> it, if you set up a WAL archive in addition to streaming replication
> and set recovery_target_timeline='latest' in recovery.conf. Starting
> with 9.3, a standby can follow timeline changes over streaming
> replication too, so in 9.3 it should just work.
>
> That's all assuming that server4 is behind server3; if it has already
> replayed WAL beyond the point in the WAL where server3 switched to a
> new timeline, it can't follow that timeline switch.
>
> - Heikki

- Heikki
Re: [ODBC] [GENERAL] ODBC constructs
On 21.05.2013 08:11, Dev Kumkar wrote:
> On Mon, May 20, 2013 at 9:12 PM, Atri Sharma wrote:
>> If you wish to work in C, then I would suggest libpq. I would wait for
>> more replies on this, as I have little knowledge about psqlODBC.
>
> Thanks for the comments. Yes, the objective is to work in C, and I
> found libpq useful, but am not sure about psqlODBC. It would be really
> great to get comments from the community here comparing the two, also
> from a performance perspective. Based on my reading so far, libpq is
> good from a performance perspective.

libpq is, generally speaking, better than psqlODBC. The advantage of ODBC is that if you have a suitable driver, you can connect to other DBMSs as well, while libpq will only work with PostgreSQL. Unless you need such portability, go with libpq.

- Heikki
Re: [GENERAL] [pgeu-general] Alphanumeric natural order sorting
(pgeu-general is not the right list for technical discussions, moving to pgsql-general)

On 20.03.2013 10:46, Albe Laurenz wrote:
> Umashanker, Srividhya wrote:
>> I am looking for a solution for alphanumeric sorting. I am expecting:
>>
>> 1, bay1
>> 2, bay2
>> 10, bay10
>> 11, bay11
>>
>> We are working on a framework, where the client can:
>> * call for sort on any column
>> * The digits may or may not be there
>> * The numeric can be anywhere in the string
>
> That's easy then. Just define exactly how you want the ordering to be,
> and based on that definition you can write code for sorting. How would
> you sort 'bay10', 'ba1y0', 'ba10y', 'ba2y0'?

The OP is asking about "natural sort order". See http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html for example. There are a few projects out there for doing that in various programming languages, but I'm not aware of anything for PostgreSQL. Maybe you could pick one of the existing functions listed in that blog post, for example, and write a PL function using them.

See also: http://blog.ringerc.id.au/2012/10/natural-sorting-example-of-utility-of.html

- Heikki
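As a sketch of the approach those posts describe (not an existing PostgreSQL function): split each string into text and number chunks, and compare the number chunks numerically. In Python, which could then be wrapped in a PL/Python function:

```python
import re

def natural_key(s):
    """Split a string into text and integer chunks, so that
    'bay10' sorts after 'bay2' instead of before it."""
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r'(\d+)', s)]

names = ["bay11", "bay2", "bay10", "bay1"]
print(sorted(names, key=natural_key))
# ['bay1', 'bay2', 'bay10', 'bay11']
```

The same key function answers Laurenz's question too: 'ba1y0', 'ba2y0', 'ba10y' compare as ('ba', 1, 'y0'), ('ba', 2, 'y0'), ('ba', 10, 'y'), i.e. by the first embedded number.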
Re: [HACKERS] [GENERAL] Floating point error
On 05.03.2013 15:59, Kevin Grittner wrote:
> Daniel Farina wrote:
>> This kind of change may have many practical problems that may make it
>> un-pragmatic to alter at this time (considering the workaround is to
>> set the extra float digits), but I can't quite grasp the rationale for
>> "well, the only program that cares about the most precision available
>> is pg_dump". It seems like most programs would care just as much.
>
> Something to keep in mind is that when you store 0.01 into a double
> precision column, the precise value stored, when written in decimal, is:
>
> 0.01000000000000000020816681711721685132943093776702880859375
>
> Of course, some values can't be precisely written in decimal with so
> few digits.

It would be nice to have a base-2 text format to represent floats. It wouldn't be as human-friendly as base-10, but it could be used when you don't want to lose precision. pg_dump in particular.

- Heikki
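Kevin's point is easy to verify outside the database. Python's Decimal constructor converts a binary double exactly, with no rounding, so it reveals the precise value that the literal 0.01 actually stores:

```python
from decimal import Decimal

# Exact decimal expansion of the IEEE 754 double nearest to 0.01:
print(Decimal(0.01))
# 0.01000000000000000020816681711721685132943093776702880859375

# The shortest representation that round-trips is much friendlier:
print(repr(0.01))
# 0.01
```

The contrast between the two outputs is exactly the tension in the thread: a dump format must preserve the first value, while humans want to read the second.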
[GENERAL] Re: [COMMITTERS] pgsql: Allow a streaming replication standby to follow a timeline switc
On 18.12.2012 13:42, hubert depesz lubaczewski wrote:
> In pg_log on ubuntu2 I see:
>
> 2012-12-18 12:41:34.428 CET [unknown]@[unknown] 1685 LOG: connection received: host=172.28.173.142 port=45842
> 2012-12-18 12:41:34.430 CET replication@[unknown] 1685 172.28.173.142(45842) LOG: replication connection authorized: user=replication
> 2012-12-18 12:41:34.432 CET replication@[unknown] 1685 172.28.173.142(45842) ERROR: requested WAL segment 00020015 has already been removed
> 2012-12-18 12:41:34.433 CET replication@[unknown] 1685 172.28.173.142(45842) LOG: disconnection: session time: 0:00:00.005 user=replication database= host=172.28.173.142 port=45842
>
> Something looks weird. To put it lightly.

Hmm, that's a different error than you got before. Thom also reported a "requested WAL segment ... has already been removed" error, but in his test case, and as far as I could reproduce it, the error doesn't reoccur when the standby reconnects. In other words, it eventually worked despite that error.

In any case, I just committed a fix for the scenario that Thom reported. Can you try again with a fresh checkout?

- Heikki
[GENERAL] Re: [COMMITTERS] pgsql: Allow a streaming replication standby to follow a timeline switc
On 15.12.2012 17:06, hubert depesz lubaczewski wrote:
> I might be missing something, but what exactly does that commit give
> us? I mean - we were able, previously, to make a slave switch to a new
> master - as Phil Sorber described in here:
> http://philsorber.blogspot.com/2012/03/what-to-do-when-your-timeline-isnt.html
>
> After some talk on IRC, I understood that this patch will make it
> possible to switch to a new master in plain SR replication, with no WAL
> archive (because if you have a wal archive, you can use the method Phil
> described, which basically "just works").

Right, that's exactly the point of the patch. A WAL archive is no longer necessary for failover.

> So I did setup three machines: master and two slaves. Master had 2 IPs
> - its own, and a floating one. Both slaves were connecting to the
> floating one, and recovery.conf looked like:
>
> standby_mode = 'on'
> primary_conninfo = 'port=5920 user=replication host=172.28.173.253'
> trigger_file = '/tmp/finish.replication'
> recovery_target_timeline='latest'
>
> After I verified that replication works to both slaves, I did failover
> one of the slaves, shut down master, and did ip takeover of floating ip
> to the slave that did takeover.

Hmm, is it possible that some WAL was generated in the old master, and streamed to the standby, after the new master was already promoted? It's important to kill the old master before promoting the new master. Otherwise the timelines diverge, so that you have some WAL on the old timeline that's not present in the new master, and some WAL in the new master's timeline that's not present in the old master. In that situation, if the standby has already replicated the WAL from the old master, it can no longer start to follow the new master. I think that would match the symptoms you're seeing.

I wouldn't rule out a bug in the patch either, though. Amit found a worrying number of bugs in his testing, and although we stamped out all the known bugs, it wouldn't surprise me if there's more :-(..
- Heikki
Re: [GENERAL] [BUGS] Prepared Statement Name Truncation
On 23.11.2012 17:53, Tom Lane wrote:
> Euler Taveira writes:
>> On 22-11-2012 04:27, Pavel Stehule wrote:
>>> significantly larger catalog
>>
>> Less than 5% of catalog columns? I don't buy your argument.
>
> It's not about count, it's about size. For instance, pg_attribute
> currently requires 140 bytes per row (counting the tuple header and
> line pointer), so adding 64 bytes would represent 45% bloat. In a
> database with lots of tables that would be painful. We could avoid
> this problem if we were prepared to make type "name" be varlena, ...

It would actually be nice to do that, because it would *reduce* the amount of space and memory used for the catalogs in the typical case, where the attribute names are much smaller than 64 bytes. I received a complaint just the other day that our backend processes consume a lot of memory, even when idle; the catalog caches are a large part of that.

- Heikki
[GENERAL] Anyone using silent_mode?
Is anyone using silent_mode=on? There's some discussion on pgsql-hackers [1] on whether it should be removed altogether in 9.2, since you can get the same functionality with "pg_ctl start", or nohup.

If you're using silent_mode, what are you using it for?

[1] http://archives.postgresql.org/message-id/1308926157-sup-7...@alvh.no-ip.org

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [GENERAL] [JDBC] How to just "link" to some data feed
Albretch Mueller wrote:
>> import/export the data into/out of PG, so you will be essentially
>> duplicating the data and having to synch it.
>
> This is exactly what I am trying to avoid. I would like for PG to
> handle the data right from the data feed.

You could write a set-returning function that reads the file, and perhaps a view on top of that. As the file is read on every invocation, that's only practical for small tables, though.

It's likely a better idea to just COPY the table into the database periodically, or perhaps write an external script that watches the modification timestamp on the file and triggers a reload whenever it changes.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
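The watch-and-reload script could be sketched like this; the feed path, table name, and psql invocation are placeholders for whatever the actual setup uses:

```python
import os
import subprocess
import time

def file_changed(path, last_mtime):
    """Return (changed, mtime): has the file been modified since last_mtime?"""
    mtime = os.path.getmtime(path)
    return mtime > last_mtime, mtime

def watch_and_reload(path, table, interval=60):
    """Poll the feed file and re-COPY it into the table when it changes."""
    last = 0.0
    while True:
        changed, seen = file_changed(path, last)
        if changed:
            last = seen
            # Replace the table contents with the new feed in one transaction.
            subprocess.run(
                ["psql", "-c",
                 "BEGIN; TRUNCATE %s; COPY %s FROM '%s'; COMMIT;"
                 % (table, table, path)],
                check=True)
        time.sleep(interval)

# To run against a real feed (path and table are placeholders):
# watch_and_reload("/path/to/feed.csv", "feed_table", interval=60)
```

A cron job doing an unconditional COPY is even simpler; the mtime check just avoids reloading a file that hasn't changed.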
Re: [GENERAL] [HACKERS] Kill a Long Running Query
Please don't cross-post to multiple mailing lists. And pgsql-hackers is not the correct list for basic usage questions. And long end-of-mail disclaimers are not generally appreciated.

Mageshwaran wrote:
> Can anybody tell me how to kill a long running query in postgresql? Is
> there any statement to kill a query?

See the user manual on administration functions, pg_cancel_backend in particular: http://www.postgresql.org/docs/8.2/interactive/functions-admin.html

Basically, you issue a "SELECT * FROM pg_stat_activity", or use plain ps, to find out the pid of the backend executing the long running query, and then use pg_cancel_backend (or kill -INT) to cancel it.

> and also tell me how to log slow queries to a log file.

Use the log_min_duration_statement configuration variable.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
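Putting the two steps together (the pid 12345 is a placeholder; the column names shown are those of the 8.2-era pg_stat_activity view):

```sql
-- Find the backend running the long query:
SELECT procpid, current_query, query_start
FROM pg_stat_activity
ORDER BY query_start;

-- Cancel it, using the procpid from the previous result:
SELECT pg_cancel_backend(12345);
```

And for the slow-query logging, in postgresql.conf:

```
# Log every statement that runs longer than 1000 ms:
log_min_duration_statement = 1000
```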
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Kenneth Marshall wrote:
> On Mon, Jan 22, 2007 at 06:42:09PM +, Simon Riggs wrote:
>> Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread...
>
> There may be other functions that could leverage a similar sort of
> infrastructure. For example, a long DB mining query could be registered
> with the system. Then, as the pieces of the table/database are brought
> into shared memory during the normal daily DB activity, they can be
> acquired without forcing the DB to run a very I/O-expensive query, when
> waiting a bit for the results would be acceptable. As long as we are
> thinking piggyback.

Yeah, I had the same idea when we discussed synchronizing sequential scans. The biggest difference is that with queries, there's often a user waiting for the query to finish, but with vacuum we don't care so much how long it takes.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Gregory Stark wrote:
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
>> I agree that index cleanup isn't > 50% of vacuum. I was trying to figure out how small, and it seems about 15% of the total table, which means if we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps 80%, assuming 5% of the table is scanned.
> Actually no. A while back I did experiments to see how fast reading a file sequentially was compared to reading the same file sequentially but skipping x% of the blocks randomly. The results were surprising (to me) and depressing. The breakeven point was about 7%.

Note that with uniformly random updates, you will have dirtied every page of the table long before you get anywhere near 5% dead space. So we have to assume a non-uniform distribution of updates for the DSM to be of any help. And if we assume a non-uniform distribution, it's a good bet that the blocks that need vacuuming are also not randomly distributed. In fact, they might very well all be in one cluster, so that scanning that cluster is indeed sequential I/O.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
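A rough sketch of the kind of experiment Gregory describes: reading a file front to back while randomly skipping a fraction of the blocks. The block size, file size, and skip fraction are assumptions, and a file this small will be fully cached, so this only illustrates the shape of the experiment, not the on-disk effect he measured.

```python
import os
import random
import tempfile
import time

BLOCK = 8192        # PostgreSQL-style page size (assumption)
NBLOCKS = 1024      # 8 MB scratch file, small enough to run quickly

# Build a scratch file of NBLOCKS pages.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"x" * (BLOCK * NBLOCKS))

def scan(skip_fraction):
    """Read the file front to back, randomly skipping a fraction of blocks."""
    random.seed(0)                      # deterministic skip pattern
    nbytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        for i in range(NBLOCKS):
            if random.random() < skip_fraction:
                continue                # skip this block entirely
            f.seek(i * BLOCK)
            nbytes += len(f.read(BLOCK))
    return time.perf_counter() - start, nbytes

seq_time, seq_bytes = scan(0.0)      # sequential: read every block
skip_time, skip_bytes = scan(0.95)   # touch only ~5% of the blocks
print(f"sequential: {seq_bytes} bytes, skipping: {skip_bytes} bytes")
os.unlink(path)
```

On real spinning disks, the seeks between the surviving blocks dominate, which is why Gregory found the breakeven point as low as skipping ~7% of the blocks.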
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Bruce Momjian wrote:
> Heikki Linnakangas wrote:
>> Russell Smith wrote:
>>> 2. Index cleanup is the most expensive part of vacuum. So doing a partial vacuum actually means more I/O as you have to do index cleanup more often.
>> I don't think that's usually the case. Index(es) are typically only a fraction of the size of the table, and since 8.2 we do index vacuums in a single scan in physical order. In fact, in many applications the index is mostly cached and the index scan doesn't generate any I/O at all.
> Are _all_ the indexes cached? I would doubt that.

Well, depends on your schema, of course. In many applications, yes.

> Also, for a typical table, what percentage is the size of all indexes combined?

Well, there's no such thing as a typical table. As an anecdote, here are the ratios (total size of all indexes of a table)/(size of the corresponding heap) for the bigger tables in a DBT-2 run I have at hand:

Stock: 68550/1190470 ≈ 6%
Order_line: 274372/950103 ≈ 29%
Customer: (5711+20567)/629011 ≈ 4%

In any case, for the statement "Index cleanup is the most expensive part of vacuum" to be true, your indexes would have to take up 2x as much space as the heap, since the heap is scanned twice. I'm sure there are databases like that out there, but I don't think it's the common case.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
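A sketch of how one could compute the same index/heap ratio for a table; 'orders' is a placeholder table name, and pg_relation_size has been available since the 8.1-era dbsize functions were brought into core:

```sql
-- Total index size vs. heap size for one table.
-- 'orders' is a hypothetical table name.
SELECT pg_relation_size('orders'::regclass) AS heap_bytes,
       (SELECT coalesce(sum(pg_relation_size(indexrelid)), 0)
        FROM pg_index
        WHERE indrelid = 'orders'::regclass) AS index_bytes;
```

Dividing index_bytes by heap_bytes gives the ratio quoted above for the DBT-2 tables.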
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Russell Smith wrote:
> 2. Index cleanup is the most expensive part of vacuum. So doing a partial vacuum actually means more I/O as you have to do index cleanup more often.

I don't think that's usually the case. Index(es) are typically only a fraction of the size of the table, and since 8.2 we do index vacuums in a single scan in physical order. In fact, in many applications the index is mostly cached and the index scan doesn't generate any I/O at all. I believe the heap scans are the biggest issue at the moment.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Russell Smith wrote:
> Strange idea that I haven't researched: given that VACUUM can't be run inside a transaction block, it should be possible at a certain point to quit the current transaction and start another one. There has been much chat, and now a TODO item, about allowing multiple vacuums so as not to starve small tables. But if a big table has a long-running vacuum, the vacuum of the small table won't be effective anyway, will it? If the vacuum of a big table were done in multiple transactions, you could reduce the effect of the long-running vacuum. I'm not sure how this affects the rest of the system, though.

That was fixed by Hannu Krosing's patch in 8.2 that made vacuum ignore other vacuums in the oldest-xmin calculation.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [GENERAL] ISO week dates
Guillaume Lelarge wrote:
> I've tried to work on the first one, the ISO day field. My patch is attached and is against CVS HEAD. It only takes care of the code; nothing is done for documentation. It works for me:

I haven't been following this thread, but I just wanted to point out that we prefer context diffs. Please resend the patch as a context diff, using "diff -c" or "cvs diff -c", so that it's easier to review.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
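A toy illustration of producing the requested context-diff format; the file names and contents are placeholders, not the actual patch from the thread:

```shell
# Make two trivial versions of a file (placeholder names and contents).
printf 'old line\n' > old.c
printf 'new line\n' > new.c

# -c produces a context diff; diff exits 1 when the files differ,
# so '|| true' keeps a script running after it.
diff -c old.c new.c > ctx.diff || true
cat ctx.diff
```

Context diffs frame each hunk with "***" / "---" markers and surrounding lines, which is what makes them easier to review than plain diffs.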