Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests
On Mon, Jul 25, 2016 at 2:38 PM, Alvaro Herrera wrote:
> Michael Paquier wrote:
> Yeah, thanks, pushed.  However this doesn't explain all the failures we see:

I missed those ones, thanks for the reminder.

> 1) In
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2016-07-14%2016%3A00%3A06
> we see the pg_basebackup test failing.  I suppose that failure is also
> because of slowness, though of course this patch won't fix it.

That's from 010_pg_basebackup... And I am not sure what's behind that.
Could it be possible to add --verbose to the commands of pg_basebackup?
We may be able to catch the problem if it shows up again.

> 2) In
> http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hamster&dt=2016-06-29%2016%3A00%3A06&stg=recovery-check
> we see a completely different failure:
>
> error running SQL: 'psql:<stdin>:1: ERROR:  relation "tab_int" does not exist
> LINE 1: SELECT count(*) FROM tab_int
>                              ^'
> while running 'psql -XAtq -d port=52824 host=/tmp/or2xHglniM dbname=postgres
> -f - -v ON_ERROR_STOP=1' at
> /home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/recovery/../../../src/test/perl/PostgresNode.pm
> line 1166.
>
> Do we have an explanation for this one?

Ah, yes, and that's a stupid mistake. We had better use replay_location
instead of write_location. There is a risk that records have not been
replayed yet even if they have been written on the standby, so it is
possible that the query looking at tab_int may not see this relation.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
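[Editor's sketch] The write_location vs. replay_location race Michael describes can be illustrated with a small Python model (purely illustrative — names like `FakeStandby` are invented here, this is not PostgreSQL code): WAL can be fully *written* on the standby while replay still lags, so a test that polls the write position may declare the standby caught up before the relation is visible.

```python
# Model a standby whose WAL write position runs ahead of its replay position.
class FakeStandby:
    def __init__(self):
        self.written_lsn = 0   # how far WAL has been written to disk
        self.replayed_lsn = 0  # how far WAL has actually been applied

    def receive_wal(self, lsn):
        self.written_lsn = lsn          # streaming is ahead of replay

    def replay(self, lsn):
        # Replay can never get ahead of what has been written.
        self.replayed_lsn = min(lsn, self.written_lsn)

    def table_visible(self, created_at_lsn):
        # A relation created at some LSN is only visible once replayed.
        return self.replayed_lsn >= created_at_lsn

standby = FakeStandby()
standby.receive_wal(100)   # CREATE TABLE record written at LSN 100...
standby.replay(50)         # ...but replay lags behind

target_lsn = 100
write_caught_up = standby.written_lsn >= target_lsn    # the buggy check
replay_caught_up = standby.replayed_lsn >= target_lsn  # the safe check

assert write_caught_up and not standby.table_visible(100)  # the race
assert not replay_caught_up   # polling replay position avoids it
```

Polling on the replay position is conservative: by the time it reports caught-up, any query against replayed data must succeed.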
Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests
Michael Paquier wrote:
> Lately hamster is failing every 4/5 days on the recovery regression
> tests in 003 covering the recovery targets, with that:
> # Postmaster PID for node "standby_2" is 20510
> #
> Timed out while waiting for standby to catch up at
> t/003_recovery_targets.pl line 36.
>
> Which means that poll_query_until timed out waiting for the standby to catch up.
>
> Here is an example of test that failed:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2016-07-24%2016%3A00%3A07

Yeah, thanks, pushed.  However this doesn't explain all the failures we see:

1) In
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2016-07-14%2016%3A00%3A06
we see the pg_basebackup test failing.  I suppose that failure is also
because of slowness, though of course this patch won't fix it.

2) In
http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hamster&dt=2016-06-29%2016%3A00%3A06&stg=recovery-check
we see a completely different failure:

error running SQL: 'psql:<stdin>:1: ERROR:  relation "tab_int" does not exist
LINE 1: SELECT count(*) FROM tab_int
                             ^'
while running 'psql -XAtq -d port=52824 host=/tmp/or2xHglniM dbname=postgres
-f - -v ON_ERROR_STOP=1' at
/home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/recovery/../../../src/test/perl/PostgresNode.pm
line 1166.

Do we have an explanation for this one?
--
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Problem in PostgresSQL Configuration with YII 1 & Wordpress
On 24 July 2016 at 02:20, Jyoti Sharma wrote:
> Hi Team,
>
> Currently we have a project running with MySQL with YII & Wordpress. Now I
> want to change the database from MySQL to PostgreSQL.

Hi. The pgsql-hackers mailing list is for development of PostgreSQL itself.
General questions are better suited to the pgsql-general mailing list or to
Stack Overflow.

Conversion from MySQL to PostgreSQL is a common question and there's lots of
info out there, though you might not like the answers, since there aren't any
automagic tools to make all existing queries "just work".

As for junk characters, look at the "bytea" data type.

For more information please post at one of the more appropriate locations
given above; I will not reply to responses to this mail on pgsql-hackers.
--
Craig Ringer    http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Increasing timeout of poll_query_until for TAP tests
Hi all,

Lately hamster is failing every 4/5 days on the recovery regression tests in
003 covering the recovery targets, with that:

# Postmaster PID for node "standby_2" is 20510
#
Timed out while waiting for standby to catch up at t/003_recovery_targets.pl line 36.

Which means that poll_query_until timed out waiting for the standby to catch up.

Here is an example of test that failed:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2016-07-24%2016%3A00%3A07

This buildfarm machine is legendarily known for its slowness, and I don't see
a better answer to that than increasing the max timeout of poll_query_until
to put that back to green. Thoughts?
--
Michael

fix-recovery-tap-failures.patch
Description: invalid/octet-stream
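[Editor's sketch] The mechanism being tuned here — retry a check until it succeeds or a maximum wait expires — can be sketched as follows. This is a Python illustration with hypothetical names, not the actual Perl `poll_query_until` in PostgresNode.pm; the point is that slow machines only need a larger `max_seconds`, not different logic.

```python
import time

def poll_until(check, max_seconds=90, interval=0.1):
    """Return True as soon as check() succeeds, False if the timeout expires."""
    deadline = time.monotonic() + max_seconds
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)   # avoid hammering the server between attempts
    return False

# Usage: pretend the standby "catches up" on the third poll.
attempts = {"n": 0}
def standby_caught_up():
    attempts["n"] += 1
    return attempts["n"] >= 3

assert poll_until(standby_caught_up, max_seconds=5, interval=0.01)
```

Raising the timeout trades a longer worst-case test run on slow machines for fewer spurious buildfarm failures; a passing run is unaffected because the loop exits as soon as the check succeeds.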
Re: [HACKERS] Constraint merge and not valid status
Hello,

At Fri, 22 Jul 2016 17:35:48 +0900, Amit Langote wrote in
<9733fae3-c32f-b150-e368-a8f87d546...@lab.ntt.co.jp>:
> On 2016/07/22 17:06, Kyotaro HORIGUCHI wrote:
> > At Fri, 22 Jul 2016 14:10:48 +0900, Amit Langote wrote:
> >> On 2016/07/22 0:38, Robert Haas wrote:
> >>> On Wed, Jul 13, 2016 at 5:22 AM, Amit Langote wrote:
> Consider a scenario where one adds a *valid* constraint on an inheritance
> parent which is then merged with a child table's *not valid* constraint
> during inheritance recursion. If merged, the constraint is not checked
> for the child data even though it may have some. Is that an oversight?
> >>>
> >>> Seems like it.  I'd recommend we just error out in that case and tell
> >>> the user that they should validate the child's constraint first.
> >>
> >> Agreed.
> >>
> >> Patch attached. In addition to the recursion-from-parent case, this seems
> >> to be broken for the ALTER TABLE child INHERIT parent case as well. So,
> >> fixed both MergeWithExistingConstraint (called from
> >> AddRelationNewConstraints) and MergeConstraintsIntoExisting (called from
> >> ATExecAddInherit). I had to add a new argument is_not_valid to the former
> >> to signal whether the constraint being propagated is itself declared NOT
> >> VALID, in which case we can proceed with merging. Also added some tests
> >> for both cases.
> >
> > It seems to work as expected and the message seems reasonable.
> > The tests seem fine.
>
> Thanks for reviewing.
>
> > By the way I have one question.
> >
> > Is it an expected configuration that tables in an inheritance
> > tree have different valid states on the same (check) constraint?
>
> I would think not.

I understand that the first problem is that the problematic state prevents a
later VALIDATE CONSTRAINT on the parent from working as expected. This patch
prohibits the state where a parent is valid and any of its children is not
valid, but allows the opposite, and that is enough to fix the problem.

I thought the opposite state was generally OK, but not with full confidence.
After reading the code a bit, it seems to affect only some cache invalidation
logic, the constraint exclusion logic that decides whether to ignore the check
constraint for each component table, and acquire_inherited_sample_rows. The
first and second wouldn't do harm. The third causes needless tuple conversion.
If that is a problem, the validity states of all relations in an inheritance
tree should be exactly the same, either valid or not valid; or we should make
the function ignore differences in validity state.

If the problem is only VALIDATE CONSTRAINT on the parent, and mixed validity
states within an inheritance tree are not, making it process the whole
inheritance tree regardless of the validity state of the parent would also fix
the problem.

After all, my concerns are the following.

- Should mixed validity states (in any form) in an inheritance tree be
  allowed? If so, VALIDATE CONSTRAINT should be fixed, not
  MergeWithExistingConstraint. If not, the opposite state should also be
  prohibited.

- Is it valid to represent all descendants' validity states by the parent's
  state? (Currently only VALIDATE CONSTRAINT does.) If not, VALIDATE
  CONSTRAINT should be fixed.

Any thoughts?

> > The check should be an equality if it's not.
>
> If you mean that the valid state should be same (equal) at all times on
> parent and all the child tables, then that is exactly what the patch tries
> to achieve. Currently, valid state of a constraint on a child table is
> left to differ from the parent in two cases as described in my messages.

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
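[Editor's sketch] The merge rule under discussion can be modeled as a small decision function. This is a Python toy (invented names, not the real C code in MergeWithExistingConstraint): merging a *valid* incoming constraint into a child's existing NOT VALID constraint must error out rather than silently treat the child's never-checked data as validated.

```python
class ConstraintStateError(Exception):
    pass

def merge_constraint(child_convalidated, incoming_is_not_valid):
    """Return the child's resulting 'convalidated' flag, or raise.

    child_convalidated:   is the child's existing constraint marked valid?
    incoming_is_not_valid: is the constraint being propagated declared NOT VALID?
    """
    if not child_convalidated and not incoming_is_not_valid:
        # A valid parent-side constraint would be merged with a NOT VALID
        # child constraint: the child's data was never checked, so error
        # out and ask the user to VALIDATE the child's constraint first.
        raise ConstraintStateError(
            "validate the child's constraint before merging")
    # Otherwise merging is safe: the child's validity state is unchanged.
    return child_convalidated

assert merge_constraint(True, False) is True    # valid + valid: fine
assert merge_constraint(True, True) is True     # NOT VALID incoming: fine
assert merge_constraint(False, True) is False   # both not valid: fine
```

Note how this prohibits only the "parent valid, child not valid" combination — the asymmetry Kyotaro is questioning above.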
Re: [HACKERS] LWLocks in DSM memory
On Mon, Jul 25, 2016 at 3:02 PM, Thomas Munro wrote:
> I measured the following times for unpatched master, on my 4 core laptop:
>
> 16 workers = 73.067s, 74.869s, 75.338s
>  8 workers = 65.846s, 67.622s, 68.039s
>  4 workers = 68.763s, 68.980s, 69.035s  <-- curiously slower here
>  3 workers = 59.701s, 59.991s, 60.133s
>  2 workers = 53.620s, 55.300s, 55.790s
>  1 worker  = 21.578s, 21.535s, 21.598s
>
> With the attached patch I got:
>
> 16 workers = 75.341s, 77.445s, 77.635s  <- +3.4%
>  8 workers = 67.462s, 68.622s, 68.851s  <- +1.4%
>  4 workers = 64.983s, 65.021s, 65.496s  <- -5.7%
>  3 workers = 60.247s, 60.425s, 60.492s  <- +0.7%
>  2 workers = 57.544s, 57.626s, 58.133s  <- +2.3%
>  1 worker  = 21.403s, 21.486s, 21.661s  <- -0.2%

Correction: that +2.3% for 2 workers should be +4.2%. And to clarify, I ran
the test 3 times as shown, and those percentage changes are based on the
middle times.
--
Thomas Munro
http://www.enterprisedb.com
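[Editor's note] The correction checks out arithmetically. Computing the percentage change from the middle (median) of the three runs, as the post describes:

```python
# Percentage change between the median run times, before vs. after the patch.
def pct_change(before_mid, after_mid):
    return (after_mid - before_mid) / before_mid * 100

# 2 workers: middle times 55.300s (unpatched) and 57.626s (patched).
assert round(pct_change(55.300, 57.626), 1) == 4.2   # the corrected figure
# 4 workers, for comparison: 68.980s -> 65.021s.
assert round(pct_change(68.980, 65.021), 1) == -5.7
```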
Re: [HACKERS] LWLocks in DSM memory
On Sun, Jul 24, 2016 at 1:10 AM, Thomas Munro wrote:
> One solution could be to provide a non-circular variant of the dlist
> interface that uses NULL list termination.  I've attached a quick
> sketch of something like that which seems to work correctly.  It is
> only lightly tested so far and probably buggy, but shows the general
> idea.

On reflection, it wouldn't make much sense to put "noncircular" in the names
of interfaces, as that is an internal detail. Maybe "relocatable" or
"position independent" or something like that, since that's a user-visible
property of the dlist_head. Here is a version of the patch that uses
_relocatable.

> Any thoughts on this approach, or better ways to solve this problem?
> How valuable is the branch-free property of the circular push and
> delete operations?

I investigated this a bit. If I do this on my laptop (clang, no asserts,
-O2), it takes 3895 milliseconds, or 4.8ns per push/delete operation:

    dlist_init(&head);
    for (i = 0; i < 100000000; ++i)
    {
        dlist_push_head(&head, &nodes[0]);
        dlist_push_tail(&head, &nodes[1]);
        dlist_push_head(&head, &nodes[2]);
        dlist_push_tail(&head, &nodes[3]);
        dlist_delete(&nodes[2]);
        dlist_delete(&nodes[3]);
        dlist_delete(&nodes[0]);
        dlist_delete(&nodes[1]);
    }

The relocatable version takes 5907 milliseconds, or 7.4ns per push/delete
operation:

    dlist_init_relocatable(&head);
    for (i = 0; i < 100000000; ++i)
    {
        dlist_push_head_relocatable(&head, &nodes[0]);
        dlist_push_tail_relocatable(&head, &nodes[1]);
        dlist_push_head_relocatable(&head, &nodes[2]);
        dlist_push_tail_relocatable(&head, &nodes[3]);
        dlist_delete_relocatable(&head, &nodes[2]);
        dlist_delete_relocatable(&head, &nodes[3]);
        dlist_delete_relocatable(&head, &nodes[0]);
        dlist_delete_relocatable(&head, &nodes[1]);
    }

Those operations are ~50% slower. So getting rid of dlist's clever
branch-free code generally seems like a bad idea.

Next I wondered if it would be a bad idea to use slower relocatable dlist
heads for all LWLocks. Obviously LWLocks are used all over the place and it
would be bad to slow them down, but I wondered if maybe dlist manipulation
might not be a very significant part of their work.
So I put a LWLock and a counter in shared memory, and had N background
workers run a tight loop that locks, increments and unlocks concurrently
until the counter reaches 1 billion. This creates different degrees of
contention and wait list sizes. The worker loop looks like this:

    while (!finished)
    {
        LWLockAcquire(&control->lock, LW_EXCLUSIVE);
        ++control->counter;
        if (control->counter >= control->goal)
            finished = true;
        LWLockRelease(&control->lock);
    }

I measured the following times for unpatched master, on my 4 core laptop:

16 workers = 73.067s, 74.869s, 75.338s
 8 workers = 65.846s, 67.622s, 68.039s
 4 workers = 68.763s, 68.980s, 69.035s  <-- curiously slower here
 3 workers = 59.701s, 59.991s, 60.133s
 2 workers = 53.620s, 55.300s, 55.790s
 1 worker  = 21.578s, 21.535s, 21.598s

With the attached patch I got:

16 workers = 75.341s, 77.445s, 77.635s  <- +3.4%
 8 workers = 67.462s, 68.622s, 68.851s  <- +1.4%
 4 workers = 64.983s, 65.021s, 65.496s  <- -5.7%
 3 workers = 60.247s, 60.425s, 60.492s  <- +0.7%
 2 workers = 57.544s, 57.626s, 58.133s  <- +2.3%
 1 worker  = 21.403s, 21.486s, 21.661s  <- -0.2%

Somewhat noisy data and different effects at different numbers of workers. I
can post the source for those tests if anyone is interested.

If you have any other ideas for access patterns to test, or clever ways to
keep push and delete branch-free while also avoiding internal pointers back
to dlist_head, I'm all ears. Otherwise, if a change affecting all LWLocks
turns out to be unacceptable, maybe we would need to have a different LWLock
interface for relocatable LWLocks to make them suitable for use in DSM
segments.

Any thoughts?
--
Thomas Munro
http://www.enterprisedb.com

lwlocks-in-dsm-v2.patch
Description: Binary data
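[Editor's sketch] The trade-off Thomas measures — a circular list with a sentinel head gives branch-free push/delete, while a NULL-terminated ("relocatable") list needs branches but never stores pointers back into the head, so it survives being mapped at a different address — can be illustrated in Python (this models the logic only, not PostgreSQL's ilist.h):

```python
class Node:
    __slots__ = ("prev", "next")
    def __init__(self):
        self.prev = self.next = None

# Circular variant: the head is a sentinel node, so push and delete need no
# branches -- but end nodes point back at the head, which breaks if the
# head's memory moves (e.g. a DSM segment mapped at a different address).
class CircularList:
    def __init__(self):
        self.head = Node()
        self.head.prev = self.head.next = self.head
    def push_head(self, n):
        n.prev, n.next = self.head, self.head.next
        n.prev.next = n.next.prev = n               # no branch
    def delete(self, n):
        n.prev.next, n.next.prev = n.next, n.prev   # no branch either

# Relocatable variant: NULL termination means nodes never point at the head,
# but push and delete must branch on the empty/end cases.
class RelocatableList:
    def __init__(self):
        self.first = None
    def push_head(self, n):
        n.prev, n.next = None, self.first
        if self.first is not None:                  # branch
            self.first.prev = n
        self.first = n
    def delete(self, n):
        if n.prev is not None:                      # branches
            n.prev.next = n.next
        else:
            self.first = n.next
        if n.next is not None:
            n.next.prev = n.prev

# Both variants support the same push/delete sequence.
for cls in (CircularList, RelocatableList):
    lst, a, b = cls(), Node(), Node()
    lst.push_head(a)
    lst.push_head(b)
    lst.delete(b)
    lst.delete(a)
```

The extra branches in the relocatable variant are the source of the ~50% per-operation slowdown measured above.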
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Andrew Borodin wrote:
> > If a feature changes the shape of WAL records, XLOG_PAGE_MAGIC is bumped
> > to prevent any problems.
>
> I've attached patch with a bump, but, obviously, it'll be irrelevant
> after any other commit changing WAL shape.
>
> Here is a patch with updated GiST README.
> I haven't found an appropriate place to describe the PageIndexTupleOverwrite
> function in the docs, since all its siblings are described only in the code.
> The code comments describing this function are coherent with the others
> (PageAddItem, PageIndexTupleDelete).

Can you please patch BRIN to use this new function too?
--
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Michael Paquier writes:
> On Sun, Jul 24, 2016 at 7:18 PM, Andrew Borodin wrote:
>> I've attached patch with a bump, but, obviously, it'll be irrelevant
>> after any other commit changing WAL shape.

> Usually the committer in charge of reviewing such a patch would bump
> it. There is no need for the patch submitter to do so. I should have
> been more precise previously, sorry for my twisted words.

It's good to remind the committer that such a bump is needed, of course.
But yeah, casting the reminder in the form of a hunk of the patch is more
likely to cause trouble than be helpful.

regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Amit Kapila writes:
> On Fri, Jul 22, 2016 at 4:32 AM, Tom Lane wrote:
>> The problem is that exec_stmt_dynexecute() loses control to the error
>> thrown from the bogus command, and therefore leaks its "querystr" local
>> variable --- which is not much of a leak, but it adds up if the loop
>> iterates enough times.  There are similar problems in many other places in
>> plpgsql.  Basically the issue is that while running a plpgsql function,
>> CurrentMemoryContext points to a function-lifespan context (the same as
>> the SPI procCxt the function is using).  We also store things such as
>> values of the function's variables there, so just resetting that context
>> is not an option.  plpgsql does have an expression-evaluation-lifespan
>> context for short-lived stuff, but anything that needs to live for more
>> or less the duration of a statement is put into the procedure-lifespan
>> context, where it risks becoming a leak.

> Wouldn't it be better if each nested sub-block (which has an exception
> handler) had separate "SPI Proc" and "SPI Exec" contexts which would be
> destroyed at sub-block end (either commit or rollback)?

AFAICS that would just confuse matters. In the first place, plpgsql variable
values are not subtransaction-local, so they'd have to live in the outermost
SPI Proc context anyway. In the second place, spi.c contains a whole lot of
assumptions that actions like saving a plan are tied to the current SPI
Proc/Exec contexts, so SPI-using plpgsql statements that were nested inside a
BEGIN/EXCEPT would probably break: state they expect to remain valid from one
execution to the next would disappear.

> In short, why do you think it is better to create a new context rather
> than using "SPI Exec"?

"SPI Exec" has the same problem as the eval_econtext: there are already
points at which it will be reset, and those can't necessarily be delayed
till end of statement. In particular, _SPI_end_call will delete whatever is
in that context.
Also, spi.c does not consider the execCxt to be an exported part of its
abstraction, and I'm pretty loath to punch another hole in that API.

Also, as I've been working through this, I've found that only a rather small
minority of plpgsql statements actually need statement-lifetime storage. So
I'm thinking that it will be faster to create such a context only on-demand,
not unconditionally; which knocks out any thought of changing plpgsql's
coding conventions so much that statement-lifespan storage would become the
normal place for CurrentMemoryContext to point.

regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Sun, Jul 24, 2016 at 12:40 PM, Amit Kapila wrote:
> In short, why do you think it is better to create a new context rather
> than using "SPI Exec"?

I think the lifespan of memory allocated from "SPI Exec" is only within the
executor, after which SPI Exec will be reset. But in many places we need such
memory beyond the executor (including the one reported in the above issue).
Consider the example of exec_stmt_dynexecute:

    exec_stmt_dynexecute()
    {
        querystr = pstrdup(querystr);
        SPI_execute(...);   /* inside this, the SPI Exec context is reset */
        /* after this, querystr is still used by this function */
    }

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
On Sun, Jul 24, 2016 at 7:18 PM, Andrew Borodin wrote:
>> If a feature changes the shape of WAL records, XLOG_PAGE_MAGIC is bumped
>> to prevent any problems.
>
> I've attached patch with a bump, but, obviously, it'll be irrelevant
> after any other commit changing WAL shape.

Usually the committer in charge of reviewing such a patch would bump it.
There is no need for the patch submitter to do so. I should have been more
precise previously, sorry for my twisted words.
--
Michael
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
> If a feature changes the shape of WAL records, XLOG_PAGE_MAGIC is bumped
> to prevent any problems.

I've attached a patch with a bump, but, obviously, it'll be irrelevant after
any other commit changing the WAL shape.

Here is a patch with the updated GiST README. I haven't found an appropriate
place to describe the PageIndexTupleOverwrite function in the docs, since all
its siblings are described only in the code. The code comments describing
this function are coherent with the others (PageAddItem,
PageIndexTupleDelete).

Best regards, Andrey Borodin, Octonica & Ural Federal University.

XLog-magic-bump.patch
Description: Binary data

PageIndexTupleOverwrite v5.patch
Description: Binary data
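[Editor's sketch] What an in-place overwrite buys over delete-then-add can be shown with a toy page model in Python (hypothetical names, not the actual bufpage code): deleting a tuple shifts every later item, while a same-size overwrite touches only the target slot — which is the memmove cost the GiST patch avoids for fixed-size updates.

```python
# A "page" is modeled as a list of equal-size tuples.
def page_overwrite(page, off, new_tuple):
    """Overwrite the item at offset 'off' (0-based here) in place."""
    if len(new_tuple) != len(page[off]):
        raise ValueError("this sketch only handles same-size replacements")
    page[off] = new_tuple   # no other item moves
    return page

def page_delete_then_add(page, off, new_tuple):
    """The old pattern: delete (shifting later items) and re-add."""
    del page[off]            # models PageIndexTupleDelete's memmove
    page.append(new_tuple)   # models PageAddItem
    return page

p1 = page_overwrite([b"aa", b"bb", b"cc"], 1, b"XX")
p2 = page_delete_then_add([b"aa", b"bb", b"cc"], 1, b"XX")
assert p1 == [b"aa", b"XX", b"cc"]   # order preserved, nothing shifted
assert p2 == [b"aa", b"cc", b"XX"]   # items after the slot were moved
```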
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Fri, Jul 22, 2016 at 4:32 AM, Tom Lane wrote:
> In
> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
> it was pointed out that you could get an intra-function-call memory leak
> from something like
>
> LOOP
>   BEGIN
>     EXECUTE 'bogus command';
>   EXCEPTION WHEN OTHERS THEN
>   END;
> END LOOP;
>
> The problem is that exec_stmt_dynexecute() loses control to the error
> thrown from the bogus command, and therefore leaks its "querystr" local
> variable --- which is not much of a leak, but it adds up if the loop
> iterates enough times.  There are similar problems in many other places in
> plpgsql.  Basically the issue is that while running a plpgsql function,
> CurrentMemoryContext points to a function-lifespan context (the same as
> the SPI procCxt the function is using).  We also store things such as
> values of the function's variables there, so just resetting that context
> is not an option.  plpgsql does have an expression-evaluation-lifespan
> context for short-lived stuff, but anything that needs to live for more
> or less the duration of a statement is put into the procedure-lifespan
> context, where it risks becoming a leak.  (That design was fine
> originally, because any error would result in abandoning function
> execution and thereby cleaning up that context.  But once we invented
> plpgsql exception blocks, it's not so good.)

Wouldn't it be better if each nested sub-block (which has an exception
handler) had separate "SPI Proc" and "SPI Exec" contexts which would be
destroyed at sub-block end (either commit or rollback)? I think one
difficulty could be that we need some way to propagate the information that
is required by outer blocks, if any such information exists.

> One way we could resolve the problem is to require all plpgsql code to
> use PG_TRY/PG_CATCH blocks to ensure that statement-lifespan variables
> are explicitly released.
> That's undesirable on pretty much every front though: it'd be notationally
> ugly, prone to omissions, and not very speedy.
>
> Another answer is to invent a third per-function memory context intended
> to hold statement-lifespan variables.

This sounds better than spreading PG_TRY/PG_CATCH everywhere.

I think if this allocation had been done in the executor context "SPI Exec",
then it wouldn't have leaked. One way could have been that by default all
exec_stmt* functions execute in the "SPI Exec" context and we then switch to
"SPI Proc" for the memory that is required for a longer duration. I think
that might not be good if we have to switch at many places, but OTOH the
same will be required for a new statement-level execution context as well.
In short, why do you think it is better to create a new context rather than
using "SPI Exec"?
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
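[Editor's sketch] The leak and the statement-lifespan-context fix being debated can be mimicked in Python (purely illustrative; PostgreSQL's MemoryContext machinery is C and the names below are invented). A "context" is a list of allocations; the fix is a scratch context that is reset whether the statement completes or throws, so per-statement data like exec_stmt_dynexecute's `querystr` never accumulates in the procedure-lifespan context.

```python
class MemoryContext:
    def __init__(self):
        self.allocations = []
    def alloc(self, obj):
        self.allocations.append(obj)
        return obj
    def reset(self):
        self.allocations.clear()

proc_ctx = MemoryContext()  # function lifespan: plpgsql variables live here
stmt_ctx = MemoryContext()  # statement lifespan: reset after every statement

def run_statement(ctx, should_fail):
    ctx.alloc("querystr copy")   # statement-local scratch allocation
    if should_fail:
        raise RuntimeError("bogus command")

# Leaky pattern: scratch goes into the procedure-lifespan context, and the
# EXCEPTION block swallows the error without ever freeing it.
for _ in range(1000):
    try:
        run_statement(proc_ctx, should_fail=True)
    except RuntimeError:
        pass
assert len(proc_ctx.allocations) == 1000   # the leak adds up per iteration

# Fixed pattern: scratch goes into a statement-lifespan context that is
# reset on both the success and the error path.
for _ in range(1000):
    try:
        run_statement(stmt_ctx, should_fail=True)
    except RuntimeError:
        pass
    finally:
        stmt_ctx.reset()   # like resetting the statement-lifespan context
assert len(stmt_ctx.allocations) == 0      # nothing accumulates
```

Tom's on-demand refinement upthread corresponds to creating `stmt_ctx` lazily, only for the minority of statements that actually need statement-lifetime storage.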