[HACKERS] Improper usage of MemoryContext in nodeSubplan.c's buildSubPlanHash() function. This may cause memory allocation failures.
Hi all,

This is a potential memory error in nodeSubplan.c (or execGrouping.c). It can be reproduced with:

    SELECT '1'::TEXT IN ((SELECT '1'::NAME) UNION ALL SELECT '1'::NAME);

In a build configured with --enable-cassert, look at the memory that slot1's tts_values[0] points to before and after the statement MemoryContextReset(evalContext) in execTuplesMatch(): you will find the memory has been set to 0x7f7f7f7f.

Here is my code analysis. A SubPlanState node has an ExprContext named 'innerecontext' that is created by the node itself in ExecInitSubPlan(). The same 'innerecontext' is also used to build the node's projection info, 'projRight'. When the SubPlanState node is executed, a hash table for the subplan has to be built, so buildSubPlanHash() is called. There, the node's 'hashtable' is initialized by calling BuildTupleHashTable(), and the code passes the 'ecxt_per_tuple_memory' of 'innerecontext' to BuildTupleHashTable() as the hash table's 'tempcxt'.

At this point we can conclude that 'projRight' and the hash table's 'tempcxt' both use the same memory context, namely the node's 'innerecontext'. So:

1) All tuples fetched from 'projRight' are allocated in 'innerecontext'.

2) All other intermediate results placed in the hash table's 'tempcxt' in fact also reside in 'innerecontext'.

Now, in buildSubPlanHash() we fetch tuples from 'projRight' to fill the node's hash table; as noted, all of those fetched tuples live in 'innerecontext'. While filling the hash table, if a tuple already exists in it, execTuplesMatch() is called to compare tuples, and inside that function the statement MemoryContextReset(evalContext) resets 'evalContext' to free some memory.
But 'evalContext' is in fact the very same 'innerecontext' in which the fetched tuples are located, so this statement resets 'innerecontext' and the fetched tuples' memory is lost. That is the problem.

Note that this error only shows up in debug builds, not release builds: when configured with --enable-cassert, freed memory is clobbered to 0x7f7f7f7f, while release builds leave it untouched. Even then, postgres does not always report an error: on !WORDS_BIGENDIAN platforms such as i386 GNU/Linux, the 0x7f7f7f7f bytes make the tests VARATT_IS_COMPRESSED() and VARATT_IS_EXTERNAL() come out false, whereas on WORDS_BIGENDIAN platforms VARATT_IS_COMPRESSED() comes out true and an error may be reported, as on AIX (POWER6, AIX V6.1).

To fix this problem, we could pass a different memory context to BuildTupleHashTable() for use as the hash table's tempcxt, or otherwise avoid sharing 'innerecontext'.

Any questions, please contact me. Regards -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] question about the _SPI_save_plan() and plan cache
"Tom Lane" writes:
> "Tao Ma" writes:
>> I knew that the delete_function() will reclaim the memory context
>> allocated for the function. But I did not find any code for removing
>> the plan (the SPI plan memory context) saved by calling _SPI_save_plan().
>
> Hmmm ... good point, those probably won't get cleaned up. In an
> environment where functions are getting changed constantly, that
> might be worth doing.
>
> regards, tom lane

Hi, I have pasted a reproduction SQL script. Is it possible to cache the SPI plan under the function cache context? Thanks.

[attachment: spi_plan_leak_eg.sql (uuencoded content omitted)]
Re: [HACKERS] question about the _SPI_save_plan() and plan cache
I knew that the delete_function() will reclaim the memory context allocated for the function. But I did not find any code for removing the plan (the SPI plan memory context) saved by calling _SPI_save_plan(). Is the plan's memory context freed when someone issues CREATE OR REPLACE FUNCTION? Thanks.

"Tom Lane" writes (25310.1248791...@sss.pgh.pa.us):
> "Tao Ma" writes:
>> Once we DROP the function, the memory consumed
>> by the plan will be leaked.
>
> I'm pretty unconcerned about DROP FUNCTION. The case that seems worth
> worrying about is CREATE OR REPLACE FUNCTION, and in that case we'll
> reclaim the storage on the next call of the function.
>
> regards, tom lane
[HACKERS] question about the _SPI_save_plan() and plan cache
Hi, It seems postgres caches plans under CacheMemoryContext during plpgsql execution. If a function has lots of variables and every one of them has a default value, postgres will allocate a lot of memory to cache the default-value plans (we have to run the function at least once). Once we DROP the function, the memory consumed by those plans is leaked. The same goes for the compiled function structure (PLpgSQL_function). Is this a known issue? Thanks, Tao Ma
Re: [HACKERS] question about deparsing const node and its typmod
Thank you for your reply. If the behavior was chosen to reproduce the actual semantics of the expression in various contexts, then it is the bpchar type in 'abc'::bpchar that surprises me. Is it really important to show 'bpchar' when there is no explicit cast on the column's default value? I searched the CVS history for 'default' and 'bpchar' and failed to figure out why 'bpchar' needs to be displayed. For a CHECK expression, or a default value involving an operator or function, the category type can be extremely useful in understanding the result of type conversion. But is it possible to omit the bpchar (or the like) when the node is a Const for a default value?

"Tom Lane" writes (18655.1238258...@sss.pgh.pa.us):
> "Tao Ma" writes:
>> CREATE TABLE "t" (c1 CHAR(5) DEFAULT 'abc',
>> c2 CHAR(5) DEFAULT 'abc'::CHAR(5));
>
>> SELECT pg_get_expr(adbin, adrelid)
>> FROM pg_attrdef
>> WHERE adrelid = (SELECT oid FROM pg_class WHERE relname = 't');
>
>>      pg_get_expr
>> ---------------------
>>  'abc'::bpchar
>>  'abc'::character(5)
>> (2 rows)
>
>> so I am curious whether there is any possibility to make the default value
>> for c1 look like the default value for c2.
>
> That behavior is very carefully chosen to reproduce the actual semantics
> of the expression in various contexts. We can't change it just to make
> it "look prettier".
>
> If you check the CVS history of ruleutils.c to see when that logic got
> changed, you should be able to locate pgsql-hackers discussions that
> worked out what the behavior has to be. I seem to remember that the
> most recent iteration had to do with making sure that ALTER COLUMN TYPE
> had unsurprising side-effects on the column's default.
> regards, tom lane
[HACKERS] question about deparsing const node and its typmod
Hi, Recently I have been reading the postgres code, and I have a question about deparsing expressions that contain Const nodes. The following SQL retrieves the definition stored by the postgres database for table "t":

CREATE TABLE "t" (c1 CHAR(5) DEFAULT 'abc', c2 CHAR(5) DEFAULT 'abc'::CHAR(5));

SELECT pg_get_expr(adbin, adrelid) FROM pg_attrdef WHERE adrelid = (SELECT oid FROM pg_class WHERE relname = 't');

     pg_get_expr
---------------------
 'abc'::bpchar
 'abc'::character(5)
(2 rows)

Postgres emits an implicit coercion for the default value of column c1 before the definition node is stored in the system catalog pg_attrdef. To retrieve a human-readable definition of the default value, pg_get_expr() calls get_func_expr() to display it; get_func_expr() omits the implicit function and returns just the first argument. The default value shown for column c2 is prettier than the one for c1, so I am curious whether there is any possibility to make the default value for c1 look like the one for c2. If we are not concerned with compatibility with old systems, is it possible to modify transformExpr() (or something else) to achieve the 'pretty' (maybe not pretty at all for someone) format? It seems possible to assign the correct typmod to the Const node during the transform phase, though most of the time that is meaningless. typmod plays an important role in the coercion decision; it seems the coercion is omitted only when both the type and the typmod match. Thanks in advance, Tao Ma
Re: [HACKERS] question about the design consideration for system catalogs
"Tom Lane" writes:
> "Tao Ma" writes:
>> But if I want to check the parameter information, it's a little hard to
>> read. If I want to write a program to re-construct the procedure source
>> code, it is not convenient to access the parameter information from
>> the front-end. What cons are there if we store the procedure and its
>> parameters in different tables (e.g. pg_attribute and pg_attrdef)?
>
> Loss of backwards compatibility, for one thing. We're not going to
> redesign those catalogs just because somebody thinks some other layout
> would be more convenient for one purpose --- it would break a lot of
> code for little gain.

Thank you. Actually, I don't want to change anything; I am just curious why this form was chosen (for example, because it is easy to access the information internally), or whether it is simply historical and we follow the original patterns. I know the system catalogs are very important and critical to the DBMS, so please don't mind me and my poor questions. TIA
[HACKERS] question about the design consideration for system catalogs
Howdy, I have a question about the design considerations behind the system catalogs. Some catalog tables use arrays to hold variable-length information, such as the procedure declaration information in pg_proc: to store a procedure's parameters, there are the proargtypes, proallargtypes, proargmodes and proargnames columns. Storing the parameters this way makes the information much easier to access for backend developers: each procedure's information can be retrieved in a single query and stored directly in a Form_pg_proc structure. But if I want to inspect the parameter information, it is a little hard to read, and if I want to write a program to re-construct the procedure's source code, it is not convenient to access the parameter information from the front end. What would be the cons of storing a procedure and its parameters in different tables (as is done for, e.g., pg_attribute and pg_attrdef)? Accessing two tables is slower than accessing one? It needs more code to access the procedure information? I think storing them separately is more conventional. TIA
[HACKERS] some questions about SELECT FOR UPDATE/SHARE
Hi, I noticed Tom Lane made SELECT FOR UPDATE/SHARE work on inheritance trees, but the todo list has not been updated to reflect this change: http://wiki.postgresql.org/wiki/Todo#Inheritance Actually, there was a patch trying to remove the ONLY clause from the foreign key checks on inheritance tables. Matt Newell, the author of the patch, removed too much (both ONLY and FOR SHARE) to achieve his aim. At that time, SELECT FOR UPDATE/SHARE could not work on inheritance tables, and without FOR UPDATE/SHARE there will be bugs if there are concurrent updates (as Tom Lane said). The discussion of Matt Newell's patch is here: http://archives.postgresql.org/pgsql-patches/2005-11/msg00062.php My question is: is it possible to remove the ONLY from the RI checks now that Tom Lane has made SELECT FOR UPDATE/SHARE work on inheritance trees? Best wishes
Re: [HACKERS] why declare arg as a array in FunctionCallInfoData structure
Thank you guys...
[HACKERS] why declare arg as a array in FunctionCallInfoData structure
hi, When I was reading the postgresql code, I noticed that the FunctionCallInfoData structure (declared in src/include/fmgr.h) contains two arrays, 'arg' and 'argnull'. Why not declare these as pointers and allocate the memory from the heap? It would save memory if 'arg' and 'argnull' were declared as pointer types. Can anyone explain it to me? Thanks in advance.
[HACKERS] huge query tree cost too much time to copyObject()
Hi, recently I wrote a really complex SELECT statement consisting of about 20 relations combined via NATURAL JOIN, where every single relation has 50 columns. It looks like:

PREPARE ugly_stmt AS SELECT * FROM t1 NATURAL JOIN t2 NATURAL JOIN t3 ... NATURAL JOIN t20 WHERE id = $1;

All the tables share only one common column, "id", which is also defined as the primary key. I set join_collapse_limit to 1 and wrote a prepared statement to call multiple times. It seems Postgres spends a lot of time in copyObject(). So can I just allocate a new context under TopMemoryContext before doing QueryRewrite() and pg_plan_queries(), and save the results into the hash table without copying query_list and plan_list again (I think they live in the context I created)? I know this exposes me to a long-term memory leak until I deallocate the prepared statement, but it saves a lot of time in my situation, and I can live with that... Thanks in advance