Re: [HACKERS] GSoC 2017: Foreign Key Arrays
Alvaro Herrera writes:
> ... However, when you create an index, you can
> indicate which operator class to use, and it may not be the default one.
> If a different one is chosen at index creation time, then a query using
> COUNT(distinct) will do the wrong thing, because DISTINCT will select
> an equality type using the type's default operator class, not the
> equality that belongs to the operator class used to create the index.
> That's wrong: DISTINCT should use the equality operator that corresponds
> to the index' operator class instead, not the default one.

Uh, what?  Surely the semantics of count(distinct x) *must not* vary depending on what indexes happen to be available.

I think what you meant to say is that the planner may only choose an optimization of this sort when the index's opclass matches the one DISTINCT will use, ie the default for the data type.

			regards, tom lane
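The distinction turns on which operator class is the default one for the column's data type. A minimal sketch of how to look that up, using only the standard pg_opclass and pg_am catalogs (int4 here is just an example type):

```
-- Which btree operator class is the default one for int4?
SELECT opcname
FROM pg_opclass c
JOIN pg_am am ON am.oid = c.opcmethod
WHERE am.amname = 'btree'
  AND c.opcintype = 'int4'::regtype
  AND c.opcdefault;            -- true only for the type's default opclass
```

count(DISTINCT x) resolves its equality operator through this default opclass regardless of what indexes exist, which is why the planner could only apply such an optimization when the index was built with that same, default opclass.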
Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
I wrote:
> So the question is, does anyone care?  I wouldn't except that our
> documentation appears to claim that we work with Perl "5.8 or later".

> And the lack of field complaints suggests strongly that nobody else
> cares.  So I'm inclined to think we just need to be more specific
> about the minimum Perl version --- but what exactly?

I've done some more research on this.  It seems to me there are four distinct components to any claim about whether we work with a particular Perl version:

1. Can it run the build-related Perl scripts needed to build PG from a bare git checkout, on non-Windows platforms?

2. Can it run our MSVC build scripts?

3. Does pl/perl compile (and pass its regression tests) against it?

4. Can it run our TAP tests?

I have no info to offer about point #2, but I did some tests with ancient 5.8.x Perl versions on my buildfarm hosts prairiedog and gaur.  (I would have liked to use something faster, but these Perl versions failed to build at all on more recent platforms, and I thought it rather pointless to try to hack them enough to build.)

I find that perl 5.8.0 and later seem to succeed at point #1, but you need at least 5.8.1 to compile plperl.  Also, on prairiedog 5.8.1 fails plperl's regression tests because of some seemingly utf8-related bug.  That may be an ancient-macOS problem more than anything else, so I didn't poke at it too hard.

The really surprising thing I found out is that you can't run the TAP tests on anything older than 5.8.3, because that's when they added "prove".  I'm bemused by the idea that such a fundamental facility would get added in a third-digit minor release, but there you have it.  (Now, in fairness, this amounted to importing a feature that already existed on CPAN, but still...)

5.8.3 does appear to succeed at points 1, 3, and 4.  Now, to get through the TAP tests you need to install IPC::Run, and you also need to update Test::More because the version shipped with 5.8.3 is too old.  But at least the failures that you get from lacking these are pretty clear.

I am unable to confirm our claim that we work with Test::More 0.82, because CPAN has only versions from a year or three back.  (Anyone know of a more, er, comprehensive archive than CPAN?)  It's also interesting to speculate about how old a version of IPC::Run is new enough, but I see no easy way to get much data on that either.

Anyway, pending some news about compatibility of the MSVC scripts, I think we ought to adjust our docs to state that 5.8.3 is the minimum supported Perl version.

Also, I got seriously confused at one point during these tests because, while our configure script carefully sets PERL to an absolute path name, it's content to set PROVE to "prove", paying no attention to whether that version of "prove" matches "perl".  Is it really okay to run the TAP tests with a different perl version than we selected for other purposes?  I think it'd be a good idea to insist that "prove" be in the same directory we found "perl" in.

			regards, tom lane
Re: [HACKERS] Adding support for Default partition in partitioning
Hi Ashutosh,

0003 patch:

> +parentRel = heap_open(parentOid, AccessExclusiveLock);
> In [2], Amit Langote has given a reason as to why heap_drop_with_catalog()
> should not heap_open() the parent relation. But this patch still calls
> heap_open() without giving any counter argument. Also I don't see
> get_default_partition_oid() using Relation anywhere. If you remove that
> heap_open() please remove following heap_close().

I think patch 0004 does exactly what you have said here, i.e. it gets rid of the heap_open() and heap_close(). The question might be why I kept patch 0004 separate, and the answer is that I wanted to make it easier to review, and keeping it that way also makes it a bit easier to work on a different approach if needed.

About this: "Also I don't see get_default_partition_oid() using Relation anywhere."

get_default_partition_oid() uses the parent Relation to retrieve the parent's PartitionDesc. Kindly let me know if you think I am still missing anything.

Regards,
Jeevan Ladhe
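For readers following only this excerpt: the patch series adds a "default" partition, a catch-all that accepts rows matching none of the parent's explicit partitions. A minimal sketch of that intended behaviour with illustrative DDL (the exact syntax implemented by the patches under review may differ):

```
CREATE TABLE orders (region text, amount numeric) PARTITION BY LIST (region);
CREATE TABLE orders_eu PARTITION OF orders FOR VALUES IN ('eu');
CREATE TABLE orders_us PARTITION OF orders FOR VALUES IN ('us');
-- the default partition receives rows that match no explicit partition
CREATE TABLE orders_other PARTITION OF orders DEFAULT;

INSERT INTO orders VALUES ('apac', 10.00);   -- routed to orders_other
```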
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
Mark Rofail wrote:
> These are limitations of the patch ordered by importance:
>
> ✗ presupposes that count(distinct y) has exactly the same notion of
> equality that the PK unique index has. In reality, count(distinct) will
> fall back to the default btree opclass for the array element type.

Operators are classified in operator classes; each data type may have more than one operator class for a particular access method. Exactly one operator class for some access method can be designated as the default one for a type.

However, when you create an index, you can indicate which operator class to use, and it may not be the default one. If a different one is chosen at index creation time, then a query using COUNT(distinct) will do the wrong thing, because DISTINCT will select an equality type using the type's default operator class, not the equality that belongs to the operator class used to create the index. That's wrong: DISTINCT should use the equality operator that corresponds to the index' operator class instead, not the default one.

I hope that made sense.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
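To make the vocabulary concrete: the equality that "belongs to" a btree operator class is its strategy-3 member. A sketch of how to inspect that in the standard catalogs (pg_opclass, pg_am, pg_amop; int4_ops is just an example opclass name):

```
-- The equality operator that belongs to the btree opclass int4_ops
-- (btree strategy number 3 is equality).
SELECT amopopr::regoperator AS equality_operator
FROM pg_amop
WHERE amopfamily = (SELECT opcfamily
                    FROM pg_opclass
                    WHERE opcname = 'int4_ops'
                      AND opcmethod = (SELECT oid FROM pg_am WHERE amname = 'btree'))
  AND amopstrategy = 3
  AND amoplefttype = 'int4'::regtype
  AND amoprighttype = 'int4'::regtype;
```

DISTINCT, by contrast, resolves its equality operator through the type's default opclass, so when an index was built with a non-default opclass the two notions of equality need not agree; that is the hazard described above.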
Re: In-place index updates and HOT (Was: [HACKERS] Patch: Write Amplification Reduction Method (WARM))
Claudio Freire wrote:
> > README.HOT says that that cost is not worth the benefit of preventing a new index write, but I think that it ought to take into account that not all index writes are equal. There is an appreciable difference between inserting a new tuple, and updating one in-place. We can remove the cost (hurting new snapshots by making them go through old heap pages) while preserving most of the benefits (no logically unnecessary index bloat).
>
> It's a neat idea.

Thanks. I think it's important to both prevent index bloat, and to make sure that only the latest version is pointed to within indexes. There are only so many ways that that can be done. I've tried to come up with a way of doing those two things that breaks as little of heapam.c as possible.

As a bonus, some kind of super-pruning of many linked HOT chains may be enabled, which is something that an asynchronous process can do when triggered by a regular prune within a user backend. This is a kind of micro-vacuum that is actually much closer to VACUUM than the kill_prior_tuple stuff, or traditional pruning, in that it potentially kills index entries (just those that were not subsequently updated in place, because the new values for the index differed), and then kills heap tuples, all together, without even keeping around a stub itemId in the heap. And, chaining together HOT chains also lets us chain together pruning. Retail index tuple deletion from pruning needs to be crash safe, unlike LP_DEAD setting.

> And, well, now that you mention, you don't need to touch indexes at all. You can create the new chain, and "update" the index to point to it, without ever touching the index itself, since you can repoint the old HOT chain's start line pointer to point to the new HOT chain, create a new pointer for the old one and point to it in the new HOT chain's t_tid. Existing index tuples thus now point to the right HOT chain without having to go into the index and make any changes. You do need the new HOT chain to live in the same page for this, however.

That seems complicated. The idea that I'm trying to preserve here is the idea that the beginning of a HOT-chain (a definition that includes a "potential HOT chain" -- a single heap tuple that could later receive a HOT UPDATE) unambiguously signals a need for physical changes to indexes in all cases. The idea that I'm trying to move away from is that those physical changes need to be new index insertions (new insertions should only happen when it is logically necessary, because indexed values changed).

Note that this can preserve the kill_prior_tuple stuff, I think, because if everything is dead within a single HOT chain (a HOT chain by our current definition -- not a chain of HOT chains) then nobody can need the index tuple. This does require adding complexity around aborted transactions, whose new (potential) HOT chain t_tid "backpointer" is still needed; we must revise the definition of a HOT chain being all_dead to accommodate that. But for the most part, we preserve HOT chains as a thing that garbage collection can independently reason about, process with single page atomic operations while still being crash safe, etc.

As far as microvacuum style garbage collection goes, at a high level, HOT chains seem like a good choke point to do clean-up of both heap tuples (pruning) and index tuples. The complexity of doing that seems manageable. And by chaining together HOT chains, you can really aggressively microvacuum many HOT chains on many pages within an asynchronous process as soon as the long running transaction goes away. We lean on temporal locality for garbage collection.

There are numerous complications that I haven't really acknowledged but am at least aware of. For one, when I say "update in place", I don't necessarily mean it literally. It's probably possible to literally update in place with unique indexes. For secondary indexes, which should still have heap TID as part of their keyspace (once you go implement that, Claudio), we probably need an index insertion immediately followed by an index deletion, often within the same leaf page.

I hope that this design, such as it is, will be reviewed as a thought experiment. What would be good or bad about a design like this in the real world, particularly as compared to alternatives that we know about? Is *some* "third way" design desirable and achievable, if not this one? By "third way" design, I mean a design that is much less invasive than adopting UNDO for MVCC, that still addresses the issues that we currently have with certain types of UPDATE-heavy workloads, especially when there are long running transactions, etc. I doubt that WARM meets this standard, unfortunately, because it doesn't do anything for cases that suffer only due to a long running xact.

I don't accept that there is a rigid dichotomy between Postgres style MVCC, and using UNDO for MVCC, and I most certainly don't accept that garbage
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
These are limitations of the patch, ordered by importance:

✗ Presupposes that count(distinct y) has exactly the same notion of equality that the PK unique index has. In reality, count(distinct) will fall back to the default btree opclass for the array element type.

- Supported actions:
  ✔ NO ACTION
  ✔ RESTRICT
  ✗ CASCADE
  ✗ SET NULL
  ✗ SET DEFAULT

✗ Coercion is unsupported, i.e. a numeric column can't reference an int8 column.

✗ Only one "ELEMENT" column allowed in a multi-column key.

✗ Undesirable dependency on default opclass semantics in the patch, which is that it supposes it can use array_eq() to detect whether or not the referencing column has changed. But I think that can be fixed without undue pain by providing a refactored version of array_eq() that can be told which element-comparison function to use.

✗ Cross-type FKs are unsupported.

--
Resolved limitations:

✔ Fatal performance issues. If you issue any UPDATE or DELETE against the PK table, you get a query like this for checking to see if the RI constraint would be violated:

SELECT 1 FROM ONLY fktable x WHERE $1 = ANY (fkcol) FOR SHARE OF x;

/* Changed into */

SELECT 1 FROM ONLY fktable x WHERE $1 @> fkcol FOR SHARE OF x;

--
Can someone help me understand the first limitation?
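The point of the rewritten check is that `= ANY(...)` over an array column has no index support, whereas array containment is indexable through the default GIN opclass for arrays. A minimal sketch of the effect; the table and index names are illustrative, and the operator direction in the patch's generated query may differ:

```
CREATE TABLE fktable (fkcol int8[]);
CREATE INDEX fktable_fkcol_gin ON fktable USING gin (fkcol);

-- The original form of the RI check cannot use the GIN index:
EXPLAIN SELECT 1 FROM ONLY fktable x WHERE 42 = ANY (fkcol);

-- A containment-based form can, via the default array_ops GIN opclass
-- (on a populated table the planner will prefer a bitmap index scan):
EXPLAIN SELECT 1 FROM ONLY fktable x WHERE fkcol @> ARRAY[42::int8];
```

With such an index in place, the RI check can be satisfied by an index probe instead of scanning every row of the referencing table, which is what resolves the "fatal performance issues" item above.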
Re: [HACKERS] segfault in HEAD when too many nested functions call
Andres Freundwrites: > [ 0002-Move-ExecProcNode-from-dispatch-to-function-pointer-.patch ] Here's a reviewed version of this patch. I added dummy ExecProcNodeMtd functions to the various node types that lacked them because they expect to be called through MultiExecProcNode instead. In the old coding, trying to call ExecProcNode on one of those node types would have led to a useful error message; as you had it, it'd have dumped core, which is not an improvement. Also, I removed the ExecReScan stanza from ExecProcNodeFirst; that should surely be redundant, because we should only get to that function through ExecProcNode(). If somehow it's not redundant, please add a comment explaining why not. Some other minor cosmetic changes, mostly comment wordsmithing. I think this (and the previous one) are committable. regards, tom lane diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c index 20fd9f8..d338cfe 100644 --- a/src/backend/executor/execProcnode.c +++ b/src/backend/executor/execProcnode.c @@ -17,15 +17,10 @@ *- */ /* - * INTERFACE ROUTINES - * ExecInitNode - initialize a plan node and its subplans - * ExecProcNode - get a tuple by executing the plan node - * ExecEndNode - shut down a plan node and its subplans - * * NOTES * This used to be three files. It is now all combined into - * one file so that it is easier to keep ExecInitNode, ExecProcNode, - * and ExecEndNode in sync when new nodes are added. + * one file so that it is easier to keep the dispatch routines + * in sync when new nodes are added. * * EXAMPLE * Suppose we want the age of the manager of the shoe department and @@ -122,6 +117,10 @@ #include "miscadmin.h" +static TupleTableSlot *ExecProcNodeFirst(PlanState *node); +static TupleTableSlot *ExecProcNodeInstr(PlanState *node); + + /* * ExecInitNode * @@ -149,6 +148,13 @@ ExecInitNode(Plan *node, EState *estate, int eflags) if (node == NULL) return NULL; + /* + * Make sure there's enough stack available. Need to check here, in + * addition to ExecProcNode() (via ExecProcNodeFirst()), to ensure the + * stack isn't overrun while initializing the node tree. + */ + check_stack_depth(); + switch (nodeTag(node)) { /* @@ -365,6 +371,13 @@ ExecInitNode(Plan *node, EState *estate, int eflags) } /* + * Add a wrapper around the ExecProcNode callback that checks stack depth + * during the first execution. + */ + result->ExecProcNodeReal = result->ExecProcNode; + result->ExecProcNode = ExecProcNodeFirst; + + /* * Initialize any initPlans present in this node. The planner put them in * a separate list for us. */ @@ -388,195 +401,51 @@ ExecInitNode(Plan *node, EState *estate, int eflags) } -/* - * ExecProcNode - * - * Execute the given node to return a(nother) tuple. - * +/* + * ExecProcNode wrapper that performs some one-time checks, before calling + * the relevant node method (possibly via an instrumentation wrapper). */ -TupleTableSlot * -ExecProcNode(PlanState *node) +static TupleTableSlot * +ExecProcNodeFirst(PlanState *node) { - TupleTableSlot *result; - - if (node->chgParam != NULL) /* something changed */ - ExecReScan(node); /* let ReScan handle this */ + /* + * Perform stack depth check during the first execution of the node. We + * only do so the first time round because it turns out to not be cheap on + * some common architectures (eg. x86). This relies on an assumption that + * ExecProcNode calls for a given plan node will always be made at roughly + * the same stack depth. 
+ */ + check_stack_depth(); + /* + * If instrumentation is required, change the wrapper to one that just + * does instrumentation. Otherwise we can dispense with all wrappers and + * have ExecProcNode() directly call the relevant function from now on. + */ if (node->instrument) - InstrStartNode(node->instrument); - - switch (nodeTag(node)) - { - /* - * control nodes - */ - case T_ResultState: - result = ExecResult((ResultState *) node); - break; - - case T_ProjectSetState: - result = ExecProjectSet((ProjectSetState *) node); - break; - - case T_ModifyTableState: - result = ExecModifyTable((ModifyTableState *) node); - break; - - case T_AppendState: - result = ExecAppend((AppendState *) node); - break; - - case T_MergeAppendState: - result = ExecMergeAppend((MergeAppendState *) node); - break; - - case T_RecursiveUnionState: - result = ExecRecursiveUnion((RecursiveUnionState *) node); - break; - - /* BitmapAndState does not yield tuples */ - - /* BitmapOrState does not yield tuples */ - - /* - * scan nodes - */ -
Re: [HACKERS] Causal reads take II
> On 12 July 2017 at 23:45, Thomas Munro wrote:
> I renamed it to "synchronous replay", because "causal reads" seemed a bit too
> arcane.

Hi,

I looked through the code of `synchronous-replay-v1.patch` a bit and ran a few tests. I didn't manage to break anything, except one mysterious error that I got only once on one of my replicas, and I couldn't reproduce it yet. Interestingly, this error did not affect the other replica or the primary. Just in case, here is the log for this error (maybe you can see something obvious that I've not noticed):

```
LOG:  could not remove directory "pg_tblspc/47733/PG_10_201707211/47732": Directory not empty
CONTEXT:  WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
LOG:  could not remove directory "pg_tblspc/47733/PG_10_201707211": Directory not empty
CONTEXT:  WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
LOG:  could not remove directory "pg_tblspc/47733/PG_10_201707211/47732": Directory not empty
CONTEXT:  WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
LOG:  could not remove directory "pg_tblspc/47733/PG_10_201707211": Directory not empty
CONTEXT:  WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
LOG:  directories for tablespace 47733 could not be removed
HINT:  You can remove the directories manually if necessary.
CONTEXT:  WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
FATAL:  could not create directory "pg_tblspc/47734/PG_10_201707211/47732": File exists
CONTEXT:  WAL redo at 0/125F5768 for Storage/CREATE: pg_tblspc/47734/PG_10_201707211/47732/47736
LOG:  startup process (PID 8034) exited with exit code 1
LOG:  terminating any other active server processes
LOG:  database system is shut down
```

And speaking about the code, so far I have just a few notes (some of them merely questions):

* In general the idea behind this patch sounds interesting to me, but it relies heavily on time synchronization. As mentioned in the documentation: "Current hardware clocks, NTP implementations and public time servers are unlikely to allow the system clocks to differ more than tens or hundreds of milliseconds, and systems synchronized with dedicated local time servers may be considerably more accurate." But as far as I remember from my own experience, sometimes it may not be that trivial on something like AWS because of virtualization. Maybe it's an unreasonable fear, but is it possible to address this problem somehow?

* Also I noticed that some time-related values are hardcoded (e.g. the 50%/25% time shift when we're dealing with leases). Does it make sense to move them out and make them configurable?

* Judging from the `SyncReplayPotentialStandby` function, it's possible to have `synchronous_replay_standby_names = "*, server_name"`, which is basically an equivalent for just `*`, but it looks confusing. Is it worth it to prevent this behaviour?

* In the same function `SyncReplayPotentialStandby` there is this code:

```
if (!SplitIdentifierString(rawstring, ',', &elemlist))
{
    /* syntax error in list */
    pfree(rawstring);
    list_free(elemlist);
    /* GUC machinery will have already complained - no need to do again */
    return false;
}
```

Am I right that ideally this (a situation where, at this point in the code, `synchronous_replay_standby_names` has an incorrect value) should not happen, because the GUC machinery will prevent it? If yes, then it seems to me it still makes sense to put a log message here, just to give more information in a potentially weird situation.

* In the function `SyncRepReleaseWaiters` there is a comment:

```
/*
 * If the number of sync standbys is less than requested or we aren't
 * managing a sync standby or a standby in synchronous replay state that
 * blocks then just leave.
 */
if ((!got_recptr || !am_sync) && !walsender_sr_blocker)
```

Is this comment correct? If I understand everything right, `!got_recptr` means the number of sync standbys is less than requested (a), `!am_sync` means we aren't managing a sync standby (b), and `walsender_sr_blocker` means a standby in synchronous replay state that blocks (c). It looks like the condition is `(a or b) and not c`.

* In the function `ProcessStandbyReplyMessage` there is code that implements this:

```
* If the standby reports that it has fully replayed the WAL for at least
* 10 seconds, then let's clear the lag times that were measured when it
* last wrote/flushed/applied a WAL record. This way we avoid displaying
* stale lag data until more WAL traffic arrives.
```

but I never found any mention of these 10 seconds in the documentation. Is it not that important? Also, question 2 is related to this one.

* In the function `WalSndGetSyncReplayStateString` all the states are in lower case except `UNKNOWN`; is there any particular reason for that?

There are also a few more not-that-important notes (mostly about some typos and a few confusing names), but I'm going to do another round of review and testing anyway so I'll just send them all
Re: [HACKERS] segfault in HEAD when too many nested functions call
On 2017-07-29 14:20:32 -0400, Tom Lane wrote:
> Here's a reviewed version of this patch.

Thanks!

> * I think you put ExecScan's CFI in the wrong place; AFAICT yours
> only covers its fast path.

Sure - but the old path already has a CFI? And it has to be inside the loop, because, well, the loop ;).

> * I think ExecAgg needs a CFI at the head, just to be sure it's hit
> in any path through that.

Yep, makes sense.

> * I agree that putting CFI inside ExecHashJoin's state machine loop
> is a good idea, because it might have to trawl through quite a lot of
> a batch file before finding a returnable tuple. But I think in merge
> and nestloop joins it's sufficient to put one CFI at the head. Neither
> of those cases can do very much processing without invoking a child
> node, where a CFI will happen.

Ok, I can live with that.

> * You missed ExecLockRows altogether.

Well, it directly calls the next ExecProcNode(), so I didn't think it was necessary. One of the aforementioned judgement calls. But I'm perfectly happy to have one there.

Thanks,

Andres
Re: [HACKERS] segfault in HEAD when too many nested functions call
Andres Freundwrites: > On 2017-07-26 16:28:38 -0400, Tom Lane wrote: >> It's certainly possible that there are long-running loops not involving >> any ExecProcNode recursion at all, but that would be a bug independent >> of this issue. The CFI in ExecProcNode itself can be replaced exactly >> either by asking all callers to do it, or by asking all callees to do it. >> I think the latter is going to be more uniform and harder to screw up. > Looks a bit better. Still a lot of judgement-y calls tho, e.g. when one > node function just calls the next, or when there's loops etc. I found > a good number of missing CFIs... > What do you think? Here's a reviewed version of this patch. Differences from yours: * I think you put ExecScan's CFI in the wrong place; AFAICT yours only covers its fast path. * I think ExecAgg needs a CFI at the head, just to be sure it's hit in any path through that. * I agree that putting CFI inside ExecHashJoin's state machine loop is a good idea, because it might have to trawl through quite a lot of a batch file before finding a returnable tuple. But I think in merge and nestloop joins it's sufficient to put one CFI at the head. Neither of those cases can do very much processing without invoking a child node, where a CFI will happen. * You missed ExecLockRows altogether. * I added some comments and cosmetic tweaks. regards, tom lane diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c index 294ad2c..20fd9f8 100644 --- a/src/backend/executor/execProcnode.c +++ b/src/backend/executor/execProcnode.c @@ -399,8 +399,6 @@ ExecProcNode(PlanState *node) { TupleTableSlot *result; - CHECK_FOR_INTERRUPTS(); - if (node->chgParam != NULL) /* something changed */ ExecReScan(node); /* let ReScan handle this */ diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c index 4f131b3..6fde7cd 100644 --- a/src/backend/executor/execScan.c +++ b/src/backend/executor/execScan.c @@ -126,6 +126,8 @@ ExecScan(ScanState *node, ExprState *qual; ProjectionInfo *projInfo; + CHECK_FOR_INTERRUPTS(); + /* * Fetch data from node */ diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index de9a18e..377916d 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -677,6 +677,8 @@ fetch_input_tuple(AggState *aggstate) if (aggstate->sort_in) { + /* make sure we check for interrupts in either path through here */ + CHECK_FOR_INTERRUPTS(); if (!tuplesort_gettupleslot(aggstate->sort_in, true, false, aggstate->sort_slot, NULL)) return NULL; @@ -1414,6 +1416,8 @@ process_ordered_aggregate_multi(AggState *aggstate, while (tuplesort_gettupleslot(pertrans->sortstates[aggstate->current_set], true, true, slot1, )) { + CHECK_FOR_INTERRUPTS(); + /* * Extract the first numTransInputs columns as datums to pass to the * transfn. 
(This will help execTuplesMatch too, so we do it @@ -2100,6 +2104,8 @@ ExecAgg(AggState *node) { TupleTableSlot *result = NULL; + CHECK_FOR_INTERRUPTS(); + if (!node->agg_done) { /* Dispatch based on strategy */ @@ -2563,6 +2569,8 @@ agg_retrieve_hash_table(AggState *aggstate) TupleTableSlot *hashslot = perhash->hashslot; int i; + CHECK_FOR_INTERRUPTS(); + /* * Find the next entry in the hash table */ diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c index aae5e3f..58045e0 100644 --- a/src/backend/executor/nodeAppend.c +++ b/src/backend/executor/nodeAppend.c @@ -59,6 +59,7 @@ #include "executor/execdebug.h" #include "executor/nodeAppend.h" +#include "miscadmin.h" static bool exec_append_initialize_next(AppendState *appendstate); @@ -204,6 +205,8 @@ ExecAppend(AppendState *node) PlanState *subnode; TupleTableSlot *result; + CHECK_FOR_INTERRUPTS(); + /* * figure out which subplan we are currently processing */ diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c index 7e0ba03..cf109d5 100644 --- a/src/backend/executor/nodeBitmapHeapscan.c +++ b/src/backend/executor/nodeBitmapHeapscan.c @@ -41,6 +41,7 @@ #include "access/transam.h" #include "executor/execdebug.h" #include "executor/nodeBitmapHeapscan.h" +#include "miscadmin.h" #include "pgstat.h" #include "storage/bufmgr.h" #include "storage/predicate.h" @@ -192,6 +193,8 @@ BitmapHeapNext(BitmapHeapScanState *node) Page dp; ItemId lp; + CHECK_FOR_INTERRUPTS(); + /* * Get next page of results if needed */ diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c index 69e2704..fc15974 100644 --- a/src/backend/executor/nodeCustom.c +++ b/src/backend/executor/nodeCustom.c @@ -15,6 +15,7 @@ #include "executor/nodeCustom.h" #include "nodes/execnodes.h" #include "nodes/plannodes.h" +#include "miscadmin.h" #include
Re: [HACKERS] Memory leak
You are right. When I added that code, it really gave me a strong bad smell. It's not worth the effort.

Thanks for your reply and suggestion.

--
Sincerely,
Fan Yang
Re: [HACKERS] Memory leak
fan yang writes:
> - src/port/quotes.c
> At line 38, at function "escape_single_quotes_ascii",
> here used "malloc" to get some memory and return the
> pointer returned by the "malloc".
> So, any caller used this function shoule free this memory.
> - /src/bin/initdb/initdb.c
> At line 327, at function "escape_quotes",
> which use function "escape_single_quotes_ascii"
> from above file.
> But at this file(/src/bin/initdb/initdb.c), there are many place
> used function "escape_quotes" but have not free the pointer returned
> by the function, thus cause memory leak.

By and large, we intentionally don't worry about memory leaks in initdb (or any other program with a limited runtime). It's not worth the maintenance effort to save a few bytes, at least not where it requires code contortions like these.

Doing a quick check with valgrind says that a run of initdb, in HEAD, leaks about 560KB over its lifespan. That's well below the threshold of pain on any machine capable of running modern Postgres reasonably.

For fun, I tried to see whether your patch moved that number appreciably, but patch(1) couldn't make any sense of it at all. I think that probably your mail program munged the whitespace in the patch. Many of us have found that the most reliable way to forward patches through email is to add them as attachments rather than pasting them into the text in-line.

Poking at it a bit harder with valgrind, it seems that the vast majority of what it reports as leaks is caused by replace_token(). If we wanted to reduce memory wastage during initdb, that would be the place to work on. But I'm dubious that it's worth any effort.

			regards, tom lane
[HACKERS] Memory leak
Hi all, While reading the code, I find some code that make memory leak: - src/port/quotes.c At line 38, at function "escape_single_quotes_ascii", here used "malloc" to get some memory and return the pointer returned by the "malloc". So, any caller used this function shoule free this memory. - /src/bin/initdb/initdb.c At line 327, at function "escape_quotes", which use function "escape_single_quotes_ascii" from above file. But at this file(/src/bin/initdb/initdb.c), there are many place used function "escape_quotes" but have not free the pointer returned by the function, thus cause memory leak. I have fixed this bug and made a patch here: diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 7303bbe..4e5429e 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -1010,6 +1010,7 @@ setup_config(void) charpath[MAXPGPATH]; const char *default_timezone; char *autoconflines[3]; +char *escaped; fputs(_("creating configuration files ... "), stdout); fflush(stdout); @@ -1043,20 +1044,28 @@ setup_config(void) conflines = replace_token(conflines, "#port = 5432", repltok); #endif +escaped = escape_quotes(lc_messages); snprintf(repltok, sizeof(repltok), "lc_messages = '%s'", - escape_quotes(lc_messages)); + escaped); +free(escaped); conflines = replace_token(conflines, "#lc_messages = 'C'", repltok); +escaped = escape_quotes(lc_monetary); snprintf(repltok, sizeof(repltok), "lc_monetary = '%s'", - escape_quotes(lc_monetary)); + escaped); +free(escaped); conflines = replace_token(conflines, "#lc_monetary = 'C'", repltok); +escaped = escape_quotes(lc_numeric); snprintf(repltok, sizeof(repltok), "lc_numeric = '%s'", - escape_quotes(lc_numeric)); + escaped); +free(escaped); conflines = replace_token(conflines, "#lc_numeric = 'C'", repltok); +escaped = escape_quotes(lc_time); snprintf(repltok, sizeof(repltok), "lc_time = '%s'", - escape_quotes(lc_time)); + escaped); +free(escaped); conflines = replace_token(conflines, "#lc_time = 'C'", repltok); switch (locale_date_order(lc_time)) @@ -1074,9 +1083,11 @@ setup_config(void) } conflines = replace_token(conflines, "#datestyle = 'iso, mdy'", repltok); +escaped = escape_quotes(default_text_search_config); snprintf(repltok, sizeof(repltok), "default_text_search_config = 'pg_catalog.%s'", - escape_quotes(default_text_search_config)); + escaped); +free(escaped); conflines = replace_token(conflines, "#default_text_search_config = 'pg_catalog.simple'", repltok); @@ -1084,11 +1095,13 @@ setup_config(void) default_timezone = select_default_timezone(share_path); if (default_timezone) { +escaped = escape_quotes(default_timezone); snprintf(repltok, sizeof(repltok), "timezone = '%s'", - escape_quotes(default_timezone)); + escaped); conflines = replace_token(conflines, "#timezone = 'GMT'", repltok); snprintf(repltok, sizeof(repltok), "log_timezone = '%s'", - escape_quotes(default_timezone)); + escaped); +free(escaped); conflines = replace_token(conflines, "#log_timezone = 'GMT'", repltok); } @@ -1289,6 +1302,7 @@ bootstrap_template1(void) char **bki_lines; charheaderline[MAXPGPATH]; charbuf[64]; +char *escaped; printf(_("running bootstrap script ... ")); fflush(stdout); @@ -1330,13 +1344,19 @@ bootstrap_template1(void) bki_lines = replace_token(bki_lines, "FLOAT8PASSBYVAL", FLOAT8PASSBYVAL ? 
"true" : "false"); -bki_lines = replace_token(bki_lines, "POSTGRES", escape_quotes(username)); +escaped = escape_quotes(username); +bki_lines = replace_token(bki_lines, "POSTGRES", escaped); +free(escaped); bki_lines = replace_token(bki_lines, "ENCODING", encodingid); -bki_lines = replace_token(bki_lines, "LC_COLLATE", escape_quotes(lc_collate)); +escaped = escape_quotes(lc_collate); +bki_lines = replace_token(bki_lines, "LC_COLLATE", escaped); +free(escaped); -bki_lines = replace_token(bki_lines, "LC_CTYPE", escape_quotes(lc_ctype)); +escaped = escape_quotes(lc_ctype); +bki_lines = replace_token(bki_lines, "LC_CTYPE", escaped); +free(escaped); /* * Pass correct LC_xxx environment to bootstrap. @@ -1391,13 +1411,16 @@ setup_auth(FILE *cmdfd) "REVOKE ALL on pg_authid FROM public;\n\n", NULL }; +char*escaped; for (line = pg_authid_setup; *line != NULL; line++) PG_CMD_PUTS(*line); +escaped = escape_quotes(superuser_password); if