[HACKERS] PROPOSAL of xmlvalidate
Hi, I am working on a patch adding xmlvalidate() functionality. LibXML 2.7.7 improved DTD, XSD, and Relax-NG validation, so I am using that. I have the idea of creating a system table for holding DTDs, XSDs, and Relax-NGs (similar to Oracle). Is that a good idea? If so, how should that table be implemented? pg_attribute and pg_class have changed quite a bit since PostgreSQL 8.0. Code included below; work in progress.

bool
validate_xml_with_xsd(const char *xsd, const char *xml)
{
    bool        result = false;
    xmlSchemaPtr schema = NULL;     /* pointer to the parsed schema */
    xmlSchemaParserCtxtPtr ctxt;
    xmlDocPtr   doc;

    xmlLineNumbersDefault(1);       /* enable line numbers in error reports */

    /* parse the XSD schema */
    ctxt = xmlSchemaNewMemParserCtxt(xsd, strlen(xsd));
    xmlSchemaSetParserErrors(ctxt,
                             (xmlSchemaValidityErrorFunc) fprintf,
                             (xmlSchemaValidityWarningFunc) fprintf,
                             stderr);
    schema = xmlSchemaParse(ctxt);
    xmlSchemaFreeParserCtxt(ctxt);
    if (schema == NULL)
        elog(ERROR, "failed to parse XSD schema");

    /* parse the XML document */
    doc = xmlReadDoc((const xmlChar *) xml,
                     "http://www.w3.org/2001/XMLSchema", NULL, 0);
    if (doc == NULL)
    {
        elog(ERROR, "failed to load XML document from input");
    }
    else
    {
        xmlSchemaValidCtxtPtr vctxt;
        int         ret;

        vctxt = xmlSchemaNewValidCtxt(schema);
        xmlSchemaSetValidErrors(vctxt,
                                (xmlSchemaValidityErrorFunc) fprintf,
                                (xmlSchemaValidityWarningFunc) fprintf,
                                stderr);
        ret = xmlSchemaValidateDoc(vctxt, doc);
        if (ret == 0)
        {
            result = true;
            elog(WARNING, "validation succeeded");
        }
        else if (ret > 0)
        {
            result = false;
            elog(WARNING, "document is not valid");
        }
        else
        {
            result = false;
            elog(WARNING, "validation failed with internal error");
        }
        xmlSchemaFreeValidCtxt(vctxt);
        xmlFreeDoc(doc);
    }

    if (schema != NULL)
        xmlSchemaFree(schema);

    return result;
}

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name
On Sat, Nov 27, 2010 at 2:17 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Thanks! The _oid variants will have to re-appear in the alter extension set schema patch, which is the last of the series. Meanwhile, I will have to merge head with the current extension patch (already overdue for a new version, waiting on purpose) so that it no longer includes the cfparser and execute from file patches too (which have changed a lot underneath it already). I expected as much, but at least at that point it will be possible to test that code. I'm not sure there's much precedent for dealing with in-commitfest patch dependencies, so here's a summary of what I think would ideally happen next (ordering counts):

1. cfparser https://commitfest.postgresql.org/action/patch_view?id=413 — ready for committer
2. pg_execute_from_file https://commitfest.postgresql.org/action/patch_view?id=414 — needs another review and maybe some more documentation
3. extensions https://commitfest.postgresql.org/action/patch_view?id=404 — needs review and minor updating; will need another version after merging the two previous patches
4. alter extension set schema https://commitfest.postgresql.org/action/patch_view?id=416 — needs review and a reviewer; will need a bitrot fix (and adding in the _oid variants); would be better to adjust once the 3 previous are in

Thanks, that's very helpful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] contrib: auth_delay module
On Sat, Nov 27, 2010 at 2:44 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Thu, Nov 4, 2010 at 6:35 AM, Stephen Frost sfr...@snowman.net wrote: * Jan Urbański (wulc...@wulczer.org) wrote: On 04/11/10 14:09, Robert Haas wrote: Hmm, I wonder how useful this is given that restriction. As KaiGai mentioned, it's more to make bruteforcing difficult (read: time consuming), right? Which it would still do, since the attacker would be bumping up against max_connections. max_connections would be a DOS point, but that's no different from today. I haven't thought of a way to test this, so I guess I'll just ask. If the attacking client just waits a few milliseconds for a response and then drops the socket, opening a new one, will the server-side walking-dead process continue to be charged against max_connections until its sleep expires? I'm not sure, either. I suspect the answer is yes. I guess you could test this by writing a loop like this: while true; do psql connection parameters that will fail authentication; done ...and then hitting ^C every few seconds during execution. After doing that for a bit, run select * from pg_stat_activity or ps auxww | grep postgres in another window. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PROPOSAL of xmlvalidate
On 11/28/2010 05:33 AM, Tomáš Pospíšil wrote: Hi, I am working on a patch adding xmlvalidate() functionality. LibXML 2.7.7 improved DTD, XSD, and Relax-NG validation, so I am using that. I have the idea of creating a system table for holding DTDs, XSDs, and Relax-NGs (similar to Oracle). Is that a good idea? If so, how should that table be implemented? pg_attribute and pg_class have changed quite a bit since PostgreSQL 8.0. In the first place you need to tell us why you think it should go in a catalog table at all. Unless you intend to use some sort of typmod with the xml type, to indicate the validation object, it seems quite unnecessary. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PLy_malloc and plperl mallocs
On 28/11/10 05:23, Andrew Dunstan wrote: On 11/27/2010 10:28 PM, Tom Lane wrote: Jan Urbański wulc...@wulczer.org writes: I noticed that PL/Python uses a simple wrapper around malloc that does ereport(FATAL) if malloc returns NULL. I find it a bit harsh; don't we normally do ERROR if we run out of memory? And while looking at how PL/Perl does these things, I find that one failed malloc (in compile_plperl_function) throws an ERROR, and the rest (in plperl_spi_prepare) are simply unguarded... I guess PL/Python should stop throwing FATAL errors and PL/Perl should get its own malloc_or_ERROR helper and start using that. The real question is why they're not using palloc instead. Well, the stuff in plperl_spi_prepare needs to be allocated in a long-lived context. We could use palloc in TopMemoryContext instead, I guess. I'll do that for PL/Python for now. While on the topic of needless FATAL errors, if you try to create a Python 3 function in a session that already loaded Python 2, you get a FATAL error with an errhint of "Start a new session to use a different Python major version." If you raise a FATAL error, you don't really have much choice other than to start a new session, since the current one just got nuked, no? Should this be an ERROR? Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rethinking representation of sort/hash semantics in queries and plans
On Sat, Nov 27, 2010 at 10:02:33PM -0500, Tom Lane wrote: In recent discussions of the plan-tree representation for KNNGIST index scans, there seemed to be agreement that it'd be a good idea to explicitly represent the expected sort ordering of the output. While thinking about how best to do that, it struck me that there are some pretty horrendous impedance mismatches in the way we do things now. Different parts of the system use two different representations of sort ordering: * A sort operator (which can have '<' or '>' semantics) plus a nulls-first flag * A btree opfamily plus direction and nulls-first flags Sounds like a good idea to me. Quite aside from the performance issues, having one way to represent things will make it clearer what's going on and easier to extend in the future. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ Patriotism is when love of your own people comes first; nationalism, when hate for people other than your own comes first. - Charles de Gaulle
[HACKERS] [PATCH] Return command tag 'REPLACE X' for CREATE OR REPLACE statements.
Hi list, Often enough when developing PostgreSQL views and functions, I have pasted the CREATE OR REPLACE commands into the wrong window/shell and run them there without realizing that I'm creating a function in the wrong database instead of replacing one. Currently psql does not provide any feedback about which action really occurred. Only after writing this patch did I realize that I could instead raise a NOTICE, like the current IF EXISTS/IF NOT EXISTS clauses do. Is that a better way to solve this? This patch returns command tag CREATE X or REPLACE X for LANGUAGE/VIEW/RULE/FUNCTION. This is done by passing completionTag from ProcessUtility to more functions, and adding a 'bool *didUpdate' argument to some lower-level functions. I'm not sure if passing back the status in a bool* is considered good style, but this way all the functions look consistent. Regards, Marti

From b848a4129e31aa021059dab5a0a7ad139fe018b3 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp ma...@juffo.org
Date: Sun, 28 Nov 2010 16:49:41 +0200
Subject: [PATCH] Return command tag 'REPLACE X' for CREATE OR REPLACE statements.

This affects CREATE OR REPLACE LANGUAGE/VIEW/RULE/FUNCTION
---
 src/backend/catalog/pg_aggregate.c  |  3 ++-
 src/backend/catalog/pg_proc.c       |  6 +-
 src/backend/commands/functioncmds.c | 16 +---
 src/backend/commands/proclang.c     | 30 ++
 src/backend/commands/view.c         | 18 ++
 src/backend/rewrite/rewriteDefine.c | 25 -
 src/backend/tcop/utility.c          |  8
 src/include/catalog/pg_proc_fn.h    |  3 ++-
 src/include/commands/defrem.h       |  2 +-
 src/include/commands/proclang.h     |  2 +-
 src/include/commands/view.h         |  2 +-
 src/include/rewrite/rewriteDefine.h |  5 +++--
 12 files changed, 88 insertions(+), 32 deletions(-)

diff --git a/src/backend/catalog/pg_aggregate.c b/src/backend/catalog/pg_aggregate.c
index eadf41f..31e91b4 100644
--- a/src/backend/catalog/pg_aggregate.c
+++ b/src/backend/catalog/pg_aggregate.c
@@ -229,7 +229,8 @@ AggregateCreate(const char *aggName,
 							  NIL,		/* parameterDefaults */
 							  PointerGetDatum(NULL),	/* proconfig */
 							  1,		/* procost */
-							  0);		/* prorows */
+							  0,		/* prorows */
+							  NULL);	/* didReplace */
 
 	/*
 	 * Okay to create the pg_aggregate entry.
diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index d464979..e456139 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -85,7 +85,8 @@ ProcedureCreate(const char *procedureName,
 				List *parameterDefaults,
 				Datum proconfig,
 				float4 procost,
-				float4 prorows)
+				float4 prorows,
+				bool *didReplace)
 {
 	Oid			retval;
 	int			parameterCount;
@@ -650,6 +651,9 @@ ProcedureCreate(const char *procedureName,
 		AtEOXact_GUC(true, save_nestlevel);
 	}
 
+	if (didReplace)
+		*didReplace = is_update;
+
 	return retval;
 }
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 2347cad..4e07c9d 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -770,7 +770,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName,
  * Execute a CREATE FUNCTION utility statement.
  */
 void
-CreateFunction(CreateFunctionStmt *stmt, const char *queryString)
+CreateFunction(CreateFunctionStmt *stmt, const char *queryString, char *completionTag)
 {
 	char	   *probin_str;
 	char	   *prosrc_str;
@@ -791,7 +791,8 @@ CreateFunction(CreateFunctionStmt *stmt, const char *queryString)
 	Oid			requiredResultType;
 	bool		isWindowFunc,
 				isStrict,
-				security;
+				security,
+				didReplace;
 	char		volatility;
 	ArrayType  *proconfig;
 	float4		procost;
@@ -958,7 +959,16 @@ CreateFunction(CreateFunctionStmt *stmt, const char *queryString)
 					parameterDefaults,
 					PointerGetDatum(proconfig),
 					procost,
-					prorows);
+					prorows,
+					&didReplace);
+
+	if (completionTag)
+	{
+		if (didReplace)
+			strcpy(completionTag, "REPLACE FUNCTION");
+		else
+			strcpy(completionTag, "CREATE FUNCTION");
+	}
 }
diff --git a/src/backend/commands/proclang.c b/src/backend/commands/proclang.c
index 9d0d3b2..10d3cd9 100644
--- a/src/backend/commands/proclang.c
+++ b/src/backend/commands/proclang.c
@@ -51,7 +51,7 @@ typedef struct
 static void create_proc_lang(const char *languageName, bool replace,
 				 Oid languageOwner, Oid handlerOid, Oid inlineOid,
-				 Oid valOid, bool trusted);
+				 Oid valOid, bool trusted, bool *didReplace);
 static PLTemplate *find_language_template(const char *languageName);
 static void AlterLanguageOwner_internal(HeapTuple tup, Relation rel,
 							Oid newOwnerId);
@@ -62,7 +62,7 @@ static void AlterLanguageOwner_internal(HeapTuple tup, Relation rel,
  * -
 */
 void
-CreateProceduralLanguage(CreatePLangStmt *stmt)
Re: [HACKERS] [GENERAL] column-level update privs + lock table
On Fri, Nov 26, 2010 at 7:11 PM, Robert Haas robertmh...@gmail.com wrote: I'm not totally convinced that this is the correct behavior. It seems a bit surprising that UPDATE privilege on a single column is enough to lock out all SELECT activity from the table. It's actually a bit surprising that even full-table UPDATE privileges are enough to do this, but this change would allow people to block access to data they can neither see nor modify. That seems counterintuitive, if not a security hole. The way I see it, it's a Good Thing to encourage people to assign UPDATE privileges on tables only as minimally as possible. The damage that a poorly coded or malicious user can do with LOCK TABLE privileges is insignificant next to the damage they can do with more UPDATE privileges than they really need. Right now, we're basically encouraging admins to grant full-table update privileges when that's not really necessary. If, in the future, Postgres supports the ability to LOCK TABLE only on specific columns, I think we could refine this permissions check so that column-level update privileges only allowed the user to lock those columns. But I think this patch is a step in the right direction. Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Return command tag 'REPLACE X' for CREATE OR REPLACE statements.
Marti Raudsepp ma...@juffo.org writes: This patch returns command tag CREATE X or REPLACE X for LANGUAGE/VIEW/RULE/FUNCTION. This is done by passing completionTag from ProcessUtility to more functions, and adding a 'bool *didUpdate' argument to some lower-level functions. I'm not sure if passing back the status in a bool* is considered good style, but this way all the functions look consistent. This is going to break clients that expect commands to return the same command tag as they have in the past. I doubt that whatever usefulness is gained will outweigh the compatibility problems. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] column-level update privs + lock table
On Fri, 2010-11-26 at 19:11 -0500, Robert Haas wrote: 2010/11/25 KaiGai Kohei kai...@ak.jp.nec.com: (2010/10/16 4:49), Josh Kupershmidt wrote: [Moving to -hackers] On Fri, Oct 15, 2010 at 3:43 AM, Simon Riggssi...@2ndquadrant.com wrote: On Mon, 2010-10-11 at 09:41 -0400, Josh Kupershmidt wrote: On Thu, Oct 7, 2010 at 7:43 PM, Josh Kupershmidtschmi...@gmail.com wrote: I noticed that granting a user column-level update privileges doesn't allow that user to issue LOCK TABLE with any mode other than Access Share. Anyone think this could be added as a TODO? Seems so to me, but you raise on Hackers. Thanks, Simon. Attached is a simple patch to let column-level UPDATE privileges allow a user to LOCK TABLE in a mode higher than Access Share. Small doc. update and regression test update are included as well. Feedback is welcome. I checked your patch, then I'd like to mark it as ready for committer. The point of this patch is trying to solve an incompatible behavior between SELECT ... FOR SHARE/UPDATE and LOCK command. On ExecCheckRTEPerms(), it allows the required accesses when no columns are explicitly specified in the query and the current user has necessary privilege on one of columns within the target relation. If we stand on the perspective that LOCK command should take same privileges with the case when we use SELECT ... FOR SHARE/UPDATE without specifying explicit columns, like COUNT(*), the existing LOCK command seems to me odd. I think this patch fixes the behavior as we expected. I'm not totally convinced that this is the correct behavior. It seems a bit surprising that UPDATE privilege on a single column is enough to lock out all SELECT activity from the table. It's actually a bit surprising that even full-table UPDATE privileges are enough to do this, but this change would allow people to block access to data they can neither see nor modify. That seems counterintuitive, if not a security hole. This comment misses the point. 
A user can already lock every row of a table, if they choose, by issuing SELECT ... FOR SHARE/UPDATE, if they have update rights on a single column. So the patch does not increase the rights of the user, it merely allows it to happen in a rational way and in a way that makes SELECT and LOCK work the same. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG question about PREPARE and EXECUTE
On Wed, Nov 10, 2010 at 11:44:52AM +0100, Boszormenyi Zoltan wrote: a question came to us in the form of a code example, which I shortened. Say, we have this structure: ... Any comment on why it isn't done? Missing feature. Originally the pure text-based statement copying wasn't able to cope with these, and then it simply wasn't implemented when we moved to server-side prepares, if my memory serves well. The same feature is implemented for cursor declaration/open, I think. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: On Sat, Nov 27, 2010 at 11:18 PM, Bruce Momjian br...@momjian.us wrote: Not sure that information moves us forward. If the postmaster cleared the memory, we would have COW in the child and probably be even slower. Well, we can determine the answers to these questions empirically. Not really. Per Bruce's description, a page would become COW the moment the postmaster touched (either write or read) any variable on it. Since we have no control over how the loader lays out static variables, the actual behavior of a particular build would be pretty random and subject to unexpected changes caused by seemingly unrelated edits. Also, the referenced URL only purports to describe the behavior of HPUX, which is not exactly a mainstream OS. I think it requires a considerable leap of faith to assume that all or even most platforms work the way this suggests, and not in the dumber fashion Andres suggested. Has anybody here actually looked at the relevant Linux or BSD kernel code? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PLy_malloc and plperl mallocs
Jan Urbański wulc...@wulczer.org writes: I'll do that for PL/Python for now. While on the topic of needless FATAL errors, if you try to create a Python 3 function in a session that already loaded Python 2, you get a FATAL error with an errhint of "Start a new session to use a different Python major version." If you raise a FATAL error, you don't really have much choice than to start a new session, since the current one just got nuked, no? Should this be an ERROR? I believe we decided that it had to be FATAL because the session could no longer be trusted to execute functions of the other python version either. Check the archives from when that patch was put in. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Report: Linux huge pages with Postgres
On Sat, 2010-11-27 at 14:27 -0500, Tom Lane wrote: This is discouraging; it certainly doesn't make me want to expend the effort to develop a production patch. Perhaps. Why do this only for shared memory? Surely the majority of memory accesses are to private memory, so being able to allocate private memory in a single huge page would be better for avoiding TLB cache misses. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Report: Linux huge pages with Postgres
Simon Riggs si...@2ndquadrant.com writes: On Sat, 2010-11-27 at 14:27 -0500, Tom Lane wrote: This is discouraging; it certainly doesn't make me want to expend the effort to develop a production patch. Perhaps. Why do this only for shared memory? There's no exposed API for causing a process's regular memory to become hugepages. Surely the majority of memory accesses are to private memory, so being able to allocate private memory in a single huge page would be better for avoiding TLB cache misses. It's not really about the number of memory accesses, it's about the number of TLB entries needed. Private memory is generally a lot smaller than shared, in a tuned PG installation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Report: Linux huge pages with Postgres
On Sun, 2010-11-28 at 12:04 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Sat, 2010-11-27 at 14:27 -0500, Tom Lane wrote: This is discouraging; it certainly doesn't make me want to expend the effort to develop a production patch. Perhaps. Why do this only for shared memory? There's no exposed API for causing a process's regular memory to become hugepages. We could make all the palloc stuff into shared memory also (private shared memory that is). We're not likely to run out of 64-bit memory addresses any time soon. Surely the majority of memory accesses are to private memory, so being able to allocate private memory in a single huge page would be better for avoiding TLB cache misses. It's not really about the number of memory accesses, it's about the number of TLB entries needed. Private memory is generally a lot smaller than shared, in a tuned PG installation. Sure, but 4MB of memory is enough to require 1000 TLB entries, which is more than enough to blow the TLB even on a Nehalem. So the size of the memory we access is already big enough to blow the cache, even without shared buffers. If the majority of accesses are from private memory then the TLB cache will already be thrashed by the time we access shared buffers again. That is at least one possible explanation for the lack of benefit. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Report: Linux huge pages with Postgres
Simon Riggs si...@2ndquadrant.com writes: On Sun, 2010-11-28 at 12:04 -0500, Tom Lane wrote: There's no exposed API for causing a process's regular memory to become hugepages. We could make all the palloc stuff into shared memory also (private shared memory that is). We're not likely to run out of 64-bit memory addresses any time soon. Mph. It's still not going to work well enough to be useful, because the kernel design for hugepages assumes a pretty static number of them. That maps well to our use of shared memory, not at all well to process local memory. Sure, but 4MB of memory is enough to require 1000 TLB entries, which is more than enough to blow the TLB even on a Nehalem. That can't possibly be right. I'm sure the chip designers have heard of programs using more than 4MB. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Sun, Nov 28, 2010 at 11:41 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Nov 27, 2010 at 11:18 PM, Bruce Momjian br...@momjian.us wrote: Not sure that information moves us forward. If the postmaster cleared the memory, we would have COW in the child and probably be even slower. Well, we can determine the answers to these questions empirically. Not really. Per Bruce's description, a page would become COW the moment the postmaster touched (either write or read) any variable on it. Since we have no control over how the loader lays out static variables, the actual behavior of a particular build would be pretty random and subject to unexpected changes caused by seemingly unrelated edits. Well, one big character array pretty much has to be laid out contiguously, and it would be pretty surprising (but not entirely impossible) to find that the linker randomly sprinkles symbols from other files in between consecutive definitions in the same source file. I think the next question to answer is to try to allocate blame for the memset/memcpy overhead between page faults and the zeroing itself. That seems like something we can easily measure by writing a test program that zeroes the same region twice and kicks out timing numbers. If, as you and Andres are arguing, the actual zeroing is minor, then we can forget this whole line of discussion and move on to other possible optimizations. If that turns out not to be true then we can worry about how best to avoid the zeroing. I have to believe that's a solvable problem; the question is whether there's a benefit. In a close race, I don't think we should get bogged down in micro-optimization here, both because micro-optimizations may not gain much and because what works well on one platform may not do much at all on another. The more general issue here is what to do about our high backend startup costs. 
Beyond trying to recycle backends for new connections, as I've previous proposed and with all the problems it entails, the only thing that looks promising here is to try to somehow cut down on the cost of populating the catcache and relcache, not that I have a very clear idea how to do that. This has to be a soluble problem because other people have solved it. To some degree we're a victim of our own flexible and extensible architecture here, but I find it pretty unsatisfying to just say, OK, well, we're slow. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Report: Linux huge pages with Postgres
On Sun, Nov 28, 2010 at 02:32:04PM -0500, Tom Lane wrote: Sure, but 4MB of memory is enough to require 1000 TLB entries, which is more than enough to blow the TLB even on a Nehalem. That can't possibly be right. I'm sure the chip designers have heard of programs using more than 4MB. According to http://www.realworldtech.com/page.cfm?ArticleID=RWT040208182719&p=8 on the Core 2 chip there wasn't even enough TLB to cover the entire onboard cache. With Nehalem there are 2304 TLB entries on the chip, which cover at least the whole onboard cache, but only just. Memory access is expensive. I think if you got good statistics on how much time your CPU is waiting for memory it'd be pretty depressing. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ Patriotism is when love of your own people comes first; nationalism, when hate for people other than your own comes first. - Charles de Gaulle
Re: [HACKERS] Rethinking representation of sort/hash semantics in queries and plans
Tom Lane t...@sss.pgh.pa.us writes: If you look closely at what we're doing with sort operators (get_ordering_op_properties pretty much encapsulates this), it turns out that a sort operator is shorthand for three pieces of information: 1. btree opfamily OID 2. specific input datatype for the opfamily 3. ascending or descending direction So to fix these problems we'd need to replace sort operator OIDs in SortGroupClause and plan nodes with those three items. Obviously, this would be slightly bulkier, but the extra cost added to copying parse and plan trees should be tiny compared to the avoided syscache lookups. A possible compromise is to avoid storing the specific input datatype. My understanding is that opfamily+datatype gives an opclass. What about storing the opclass OID there? Other than that, cleaning up the current situation by having as good a view of the bigger picture as you have now sounds great. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas wrote: On Sat, Nov 27, 2010 at 11:18 PM, Bruce Momjian br...@momjian.us wrote: Not sure that information moves us forward. If the postmaster cleared the memory, we would have COW in the child and probably be even slower. Well, we can determine the answers to these questions empirically. I think some more scrutiny of the code with the points you and Andres and Tom have raised is probably in order, and probably some more benchmarking, too. I haven't had a chance to do that yet, however. Basically, my bet is that if you allocated a large zero-data variable in the postmaster but never accessed it from the postmaster, at most you would copy-on-write (COW) fault in two pages, one at the beginning that is shared with accessed variables, and one at the end. The remaining pages (4k default for x86) would be zero-filled and not COW shared. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: The more general issue here is what to do about our high backend startup costs. Beyond trying to recycle backends for new connections, as I've previous proposed and with all the problems it entails, the only thing that looks promising here is to try to somehow cut down on the cost of populating the catcache and relcache, not that I have a very clear idea how to do that. One comment to make here is that it would be a serious error to focus on the costs of just starting and stopping a backend; you have to think about cases where the backend does at least some useful work in between, and that means actually *populating* those caches (to some extent) not just initializing them. Maybe your wording above was chosen with that in mind, but I think onlookers might easily overlook the point. FWIW, today I've been looking into getting rid of the silliness in build_index_pathkeys whereby it reconstructs pathkey opfamily OIDs from sortops instead of just using the index opfamilies directly. It turns out that once you fix that, there is no need at all for relcache to cache per-index operator data (the rd_operator arrays) because that's the only code that uses 'em. I don't see any particular change in the runtime of the regression tests from ripping out that part of the cached data, but it ought to have at least some beneficial effect on real startup time. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rethinking representation of sort/hash semantics in queries and plans
Dimitri Fontaine dimi...@2ndquadrant.fr writes: Tom Lane t...@sss.pgh.pa.us writes: So to fix these problems we'd need to replace sort operator OIDs in SortGroupClause and plan nodes with those three items. Obviously, this would be slightly bulkier, but the extra cost added to copying parse and plan trees should be tiny compared to the avoided syscache lookups. My understanding is that opfamily+datatype gives an opclass. What about storing the opclass OID there? Then you'd just need to look up the opfamily again. Opclasses are too small a division to be useful. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] column-level update privs + lock table
On Sun, Nov 28, 2010 at 11:35 AM, Simon Riggs si...@2ndquadrant.com wrote: On Fri, 2010-11-26 at 19:11 -0500, Robert Haas wrote: 2010/11/25 KaiGai Kohei kai...@ak.jp.nec.com: (2010/10/16 4:49), Josh Kupershmidt wrote: [Moving to -hackers] On Fri, Oct 15, 2010 at 3:43 AM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, 2010-10-11 at 09:41 -0400, Josh Kupershmidt wrote: On Thu, Oct 7, 2010 at 7:43 PM, Josh Kupershmidt schmi...@gmail.com wrote: I noticed that granting a user column-level update privileges doesn't allow that user to issue LOCK TABLE with any mode other than Access Share. Anyone think this could be added as a TODO? Seems so to me, but you raise on Hackers. Thanks, Simon. Attached is a simple patch to let column-level UPDATE privileges allow a user to LOCK TABLE in a mode higher than Access Share. Small doc. update and regression test update are included as well. Feedback is welcome. I checked your patch, then I'd like to mark it as ready for committer. The point of this patch is trying to solve an incompatible behavior between SELECT ... FOR SHARE/UPDATE and LOCK command. On ExecCheckRTEPerms(), it allows the required accesses when no columns are explicitly specified in the query and the current user has necessary privilege on one of columns within the target relation. If we stand on the perspective that LOCK command should take same privileges with the case when we use SELECT ... FOR SHARE/UPDATE without specifying explicit columns, like COUNT(*), the existing LOCK command seems to me odd. I think this patch fixes the behavior as we expected. I'm not totally convinced that this is the correct behavior. It seems a bit surprising that UPDATE privilege on a single column is enough to lock out all SELECT activity from the table. It's actually a bit surprising that even full-table UPDATE privileges are enough to do this, but this change would allow people to block access to data they can neither see nor modify. 
That seems counterintuitive, if not a security hole. This comment misses the point. A user can already lock every row of a table, if they choose, by issuing SELECT ... FOR SHARE/UPDATE, if they have update rights on a single column. So the patch does not increase the rights of the user, it merely allows it to happen in a rational way and in a way that makes SELECT and LOCK work the same. Locking every row of the table allows a user with UPDATE privileges to block all current UPDATE and DELETE statements, but it won't necessarily block INSERT statements (depending on unique indices) and it certainly won't block SELECT statements. This patch proposes to allow a user with update privileges on a single column to lock out ALL concurrent activity, reads and writes. So it is not by any definition making SELECT and LOCK work the same. What it IS doing is making column-level update permissions and table-level update permissions work the same way. After all, one might argue, if full-table update permissions allow a user to take an access exclusive lock, why not single-column update permissions? I think, though, that there is a reasonable argument to be made that a user who has been given UPDATE privileges on the entire table contents is more trusted than one who has privileges only on certain columns. The first user can presumably trash the entire table contents if he so desires; the second one can't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] SSI using rw-conflict lists
Well, it's been a productive holiday weekend. I've completed the switch of the SSI implementation from one conflict pointer in and one conflict pointer out per transaction to a list of conflicts between transactions. This eliminated all false positives in my dtester suite. The only test which still had any was the one based on the oft-cited receipting example, which had 210 permutations of how the statements could be interleaved, of which six actually created an anomaly; but about half the others were false positives because of the limits of the conflict pointers. For those following along, the initial (2008 ACM SIGMOD) paper on SSI used booleans to flag conflicts. Cahill's 2009 doctoral work improved the false positive rate by switching these to pointers. The obvious next step for anyone competent in the field was to go from simple pointers to lists. That was in my mind as a useful optimization from the beginning, but was brought to a head by Heikki and Jeff with their concerns about memory management -- I couldn't see a good way to fix that without first implementing the lists. Along the way I also implemented more aggressive clean-up of memory resources, including immediate complete clean-up of a transaction which is rolled back. The patch is now also poised to use Andres Freund's IDLE IN TRANSACTION cancellation code in such a way that SSI can guarantee that even the most pessimal load will not thrash to the point of no progress -- when this is done, some transaction which wrote data must successfully commit before transactions concurrent to it can be cancelled. These nice features required the conflict list. I need some advice, though. I had somehow had it in my head that fields not declared static had a shared copy among all PostgreSQL backends. When things didn't behave properly, I discovered the error of my ways, and moved them to a shared memory structure used by the SSI code, just to get things working. 
It looks kinda ugly at the moment, and I'm not sure what would be considered best practices to clean it up. (1) Should these be moved to ShmemVariableCache, or is it OK to leave them in this structure as long as I comment it adequately? (2) Would it be a good idea to create macros for referencing these fields? The current references are long and distracting to read, and would all need to change if the fields are moved to a different shared memory structure. The relevant commit diff is here: http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=bc3aba45e192afcd7776c68e28438991a3406db6 There's another issue involving clarity of code, versus correct behavior. When I got to the last few false positives, I found that they could only be eliminated by tracking one more datum for a transaction committed with a conflict out to another transaction. There was a field which was unused in the particular situations where we needed this new datum, so I made it work with the dual use of the existing field. (3) What would be the cleanest way to conditionally store values representing different things in one field of a structure, so that someone reading the code understands what's happening? The commit diff for this is here: http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=9425d7fa551a1433bdbec1de5cfb1bfa9f43da22 As always, any other tips or criticisms (or encouragement!) are welcome. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rethinking representation of sort/hash semantics in queries and plans
I wrote: (For some extra amusement, trace through where build_index_pathkeys' data comes from...) While I don't propose to implement right away the whole SortGroupClause and plan tree modification sketched above, I did look into fixing build_index_pathkeys so that it doesn't uselessly convert from opfamilies to sort operators and back again. The main reason this is relevant at the moment is that we could get rid of relcache.c's caching of operators related to indexes, which seems possibly useful in connection with the current discussion of backend startup time. What becomes apparent when you look closely at what that code is doing is that it's catering to the possibility of an amcanorder index access method that isn't btree. As things stand in HEAD, an add-on index access method would be recognized as supporting ordering so long as it registers the regular comparison operators (<, >, etc) with the same index strategy numbers as btree. The reason that it would work is that those operators would be found as the fwdsortop/revsortop entries for the index, and then looking up those operator OIDs in btree opfamilies would locate the corresponding btree opfamily OIDs, which is what you have to have to match to a pathkey's opfamily. In the attached draft patch that would no longer work, because the new access method would have to have the exact same opfamily OIDs as btree in order to match to btree-derived pathkeys --- and of course it can't have that, since access method is a property of an opfamily. Now, this loss of flexibility doesn't particularly bother me, because I know of no existing or contemplated btree-substitute access methods. If one did appear on the horizon, there are a couple of ways we could fix the problem, the cleanest being to let a non-btree opfamily declare that it sorts the same as btree opfamily so-and-so. 
Or we could fix it locally in plancat.c by performing the lookup-the-operators-and-then-the-btree-opfamilies dance on the fly when setting up IndexOptInfo for a non-btree amcanorder index. But I'm disinclined to write such code when there's no way to test it and no foreseeable possibility that it'll ever be used. Maybe we should just make plancat.c throw a not-implemented error if amcanorder is true but it's not btree. Thoughts? Anyone particularly opposed to pursuing an optimization of this kind at all? regards, tom lane

PS: the attached patch doesn't yet include removal of relcache rd_operator arrays, since that would just make the patch bigger without exposing any new issues.

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index d8fc12068fb6113c0baf8aa4e5941e0150ce3521..f73e0e6dc6007ce768828480f74621752aaf57d2 100644
*** a/src/backend/optimizer/path/indxpath.c
--- b/src/backend/optimizer/path/indxpath.c
*************** find_usable_indexes(PlannerInfo *root, R
*** 380,386 ****
  		 * how many of them are actually useful for this query.  This is not
  		 * relevant unless we are at top level.
  		 */
! 		index_is_ordered = OidIsValid(index->fwdsortop[0]);
  		if (index_is_ordered && possibly_useful_pathkeys &&
  			istoplevel && outer_rel == NULL)
  		{
--- 380,386 ----
  		 * how many of them are actually useful for this query.  This is not
  		 * relevant unless we are at top level.
  		 */
! 		index_is_ordered = (index->sortopfamily != NULL);
  		if (index_is_ordered && possibly_useful_pathkeys &&
  			istoplevel && outer_rel == NULL)
  		{
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 8af0c6dc482372244b15a8a5f5a4a562e34ac91b..6325655eed84759168a51aed3f712582ec9c1f00 100644
*** a/src/backend/optimizer/path/pathkeys.c
--- b/src/backend/optimizer/path/pathkeys.c
*************** static PathKey *make_canonical_pathkey(P
*** 36,47 ****
  					   EquivalenceClass *eclass, Oid opfamily,
  					   int strategy, bool nulls_first);
  static bool pathkey_is_redundant(PathKey *new_pathkey, List *pathkeys);
- static PathKey *make_pathkey_from_sortinfo(PlannerInfo *root,
- 						   Expr *expr, Oid ordering_op,
- 						   bool nulls_first,
- 						   Index sortref,
- 						   bool create_it,
- 						   bool canonicalize);
  static Var *find_indexkey_var(PlannerInfo *root, RelOptInfo *rel,
  				  AttrNumber varattno);
  static bool right_merge_direction(PlannerInfo *root, PathKey *pathkey);
--- 36,41 ----
*************** canonicalize_pathkeys(PlannerInfo *root,
*** 224,232 ****
  /*
   * make_pathkey_from_sortinfo
!  *	  Given an expression, a sortop, and a nulls-first flag, create
!  *	  a PathKey.  If canonicalize = true, the result is a canonical
!  *	  PathKey, otherwise not.  (But note it might be redundant anyway.)
   *
   * If the PathKey is being generated from a SortGroupClause, sortref should be
   * the SortGroupClause's SortGroupRef; otherwise zero.
--- 218,226 ----
  /*
   * make_pathkey_from_sortinfo
!  *	  Given an expression and sort-order information, create
Re: [HACKERS] Report: Linux huge pages with Postgres
On Sat, Nov 27, 2010 at 02:27:12PM -0500, Tom Lane wrote: We've gotten a few inquiries about whether Postgres can use huge pages under Linux. In principle that should be more efficient for large shmem regions, since fewer TLB entries are needed to support the address space. I spent a bit of time today looking into what that would take. My testing was done with current Fedora 13, kernel version 2.6.34.7-61.fc13.x86_64 --- it's possible some of these details vary across other kernel versions. You can test this with fairly minimal code changes, as illustrated in the attached not-production-grade patch. To select huge pages we have to include SHM_HUGETLB in the flags for shmget(), and we have to be prepared for failure (due to permissions or lack of allocated hugepages). I made the code just fall back to a normal shmget on failure. A bigger problem is that the shmem request size must be a multiple of the system's hugepage size, which is *not* a constant even though the test patch just uses 2MB as the assumed value. For a production-grade patch we'd have to scrounge the active value out of someplace in the /proc filesystem (ick). I would expect that you can just iterate through the size possibilities pretty quickly and just use the first one that works -- no /proc groveling. In addition to the code changes there are a couple of sysadmin requirements to make huge pages available to Postgres: 1. You have to configure the Postgres user as a member of the group that's permitted to allocate hugepage shared memory. I did this: sudo sh -c "id -g postgres >/proc/sys/vm/hugetlb_shm_group" For production use you'd need to put this in the PG initscript, probably, to ensure it gets re-set after every reboot and before PG is started. Since it would take advantage of them automatically, this would be just a normal DBA/admin task. 2. You have to manually allocate some huge pages --- there doesn't seem to be any setting that says just give them out on demand. 
I did this: sudo sh -c "echo 600 >/proc/sys/vm/nr_hugepages" which gave me a bit over 1GB of space reserved as huge pages. Again, this'd have to be done over again at each system boot. Same. For testing purposes, I figured that what I wanted to stress was postgres process swapping and shmem access. I built current git HEAD with --enable-debug and no other options, and tested with these non-default settings: shared_buffers = 1GB, checkpoint_segments = 50, fsync = off (fsync intentionally off since I'm not trying to measure disk speed). The test machine has two dual-core Nehalem CPUs. Test case is pgbench at -s 25; I ran several iterations of pgbench -c 10 -T 60 bench in each configuration. And the bottom line is: if there's any performance benefit at all, it's on the order of 1%. The best result I got was about 3200 TPS with hugepages, and about 3160 without. The noise in these numbers is more than 1% though. This is discouraging; it certainly doesn't make me want to expend the effort to develop a production patch. However, perhaps someone else can try to show a greater benefit under some other test conditions. regards, tom lane I would not really expect to see much benefit in the region that the normal TLB page size would cover with the typical number of TLB entries. 1GB of shared buffers would not be enough to cause TLB thrashing with most processors. Bump it to 8-32GB or more and if the queries use up TLB entries with local work_mem you should see some more value in the patch. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] contrib: auth_delay module
On Sun, Nov 28, 2010 at 5:38 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Nov 27, 2010 at 2:44 PM, Jeff Janes jeff.ja...@gmail.com wrote: I haven't thought of a way to test this, so I guess I'll just ask. If the attacking client just waits a few milliseconds for a response and then drops the socket, opening a new one, will the server-side walking-dead process continue to be charged against max_connections until its sleep expires? I'm not sure, either. I suspect the answer is yes. I guess you could test this by writing a loop like this: while true; do psql connection parameters that will fail authentication; done ...and then hitting ^C every few seconds during execution. After doing that for a bit, run select * from pg_stat_activity or ps auxww | grep postgres in another window. Right, I didn't think of using psql, I thought I'd have to wrangle my own socket code. I wrote up a perl script that spawns psql and immediately kills it. I quickly start getting psql: FATAL: sorry, too many clients already errors. And that condition doesn't clear until the sleep expires on the earliest ones spawned. So it looks like the max_connections is charged until the auth_delay expires. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Sun, Nov 28, 2010 at 3:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The more general issue here is what to do about our high backend startup costs. Beyond trying to recycle backends for new connections, as I've previously proposed and with all the problems it entails, the only thing that looks promising here is to try to somehow cut down on the cost of populating the catcache and relcache, not that I have a very clear idea how to do that. One comment to make here is that it would be a serious error to focus on the costs of just starting and stopping a backend; you have to think about cases where the backend does at least some useful work in between, and that means actually *populating* those caches (to some extent) not just initializing them. Maybe your wording above was chosen with that in mind, but I think onlookers might easily overlook the point. I did have that in mind, but I agree the point is worth mentioning. So, for example, it wouldn't gain anything meaningful for us to postpone catcache initialization until someone executes a query. It would improve the synthetic benchmark, but that's it. FWIW, today I've been looking into getting rid of the silliness in build_index_pathkeys whereby it reconstructs pathkey opfamily OIDs from sortops instead of just using the index opfamilies directly. It turns out that once you fix that, there is no need at all for relcache to cache per-index operator data (the rd_operator arrays) because that's the only code that uses 'em. I don't see any particular change in the runtime of the regression tests from ripping out that part of the cached data, but it ought to have at least some beneficial effect on real startup time. Wow, that's great. The fact that it simplifies the code is probably the main point, but obviously whatever cycles we can save during startup (and ongoing operation) are all to the good. 
One possible way to get a real speedup here would be to look for ways to trim the number of catcaches. But I'm not too convinced there's much water to squeeze out of that rock. After our recent conversation about KNNGIST, it occurred to me to wonder whether there's really any point in pretending that a user can usefully add an AM, both due to hard-wired planner knowledge and due to lack of any sort of extensible XLOG support. If not, we could potentially turn pg_am into a hardcoded lookup table rather than a modifiable catalog, which would also likely be faster; and perhaps reference AMs elsewhere with characters rather than OIDs. But even if this were judged a sensible thing to do I'm not very sure that even a purpose-built synthetic benchmark would be able to measure the speedup. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: After our recent conversation about KNNGIST, it occurred to me to wonder whether there's really any point in pretending that a user can usefully add an AM, both due to hard-wired planner knowledge and due to lack of any sort of extensible XLOG support. If not, we could potentially turn pg_am into a hardcoded lookup table rather than a modifiable catalog, which would also likely be faster; and perhaps reference AMs elsewhere with characters rather than OIDs. But even if this were judged a sensible thing to do I'm not very sure that even a purpose-built synthetic benchmark would be able to measure the speedup. Well, the lack of extensible XLOG support is definitely a big handicap to building a *production* index AM as an add-on. But it's not such a handicap for development. And I don't believe that the planner is hardwired in any way that doesn't allow new index types. GIST and GIN have both been added successfully without kluging the planner. It does know a lot more about btree than other index types, but that doesn't mean you can't add a new index type that doesn't behave like btree; that's more reflective of where development effort has been spent. So I would consider the above idea a step backwards, and I doubt it would save anything meaningful anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] column-level update privs + lock table
(2010/11/27 9:11), Robert Haas wrote: 2010/11/25 KaiGai Kohei kai...@ak.jp.nec.com: (2010/10/16 4:49), Josh Kupershmidt wrote: [Moving to -hackers] On Fri, Oct 15, 2010 at 3:43 AM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, 2010-10-11 at 09:41 -0400, Josh Kupershmidt wrote: On Thu, Oct 7, 2010 at 7:43 PM, Josh Kupershmidt schmi...@gmail.com wrote: I noticed that granting a user column-level update privileges doesn't allow that user to issue LOCK TABLE with any mode other than Access Share. Anyone think this could be added as a TODO? Seems so to me, but you raise on Hackers. Thanks, Simon. Attached is a simple patch to let column-level UPDATE privileges allow a user to LOCK TABLE in a mode higher than Access Share. Small doc. update and regression test update are included as well. Feedback is welcome. I checked your patch, then I'd like to mark it as ready for committer. The point of this patch is trying to solve an incompatible behavior between SELECT ... FOR SHARE/UPDATE and LOCK command. On ExecCheckRTEPerms(), it allows the required accesses when no columns are explicitly specified in the query and the current user has necessary privilege on one of columns within the target relation. If we stand on the perspective that LOCK command should take same privileges with the case when we use SELECT ... FOR SHARE/UPDATE without specifying explicit columns, like COUNT(*), the existing LOCK command seems to me odd. I think this patch fixes the behavior as we expected. I'm not totally convinced that this is the correct behavior. It seems a bit surprising that UPDATE privilege on a single column is enough to lock out all SELECT activity from the table. It's actually a bit surprising that even full-table UPDATE privileges are enough to do this, but this change would allow people to block access to data they can neither see nor modify. That seems counterintuitive, if not a security hole. 
In my understanding, UPDATE privilege on a single column also allows locking out concurrent updates, even if the query only updates rows partially. Therefore, the current code considers UPDATE privilege on a single column enough to lock out the table. Right? My comment was from a standpoint that wants consistent behavior between SELECT ... FOR SHARE/UPDATE and the LOCK command. If we are concerned about this behavior, ExecCheckRTEPerms() might be a place we should also fix. Thanks, -- KaiGai Kohei kai...@ak.jp.nec.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] contrib: auth_delay module
On Sun, Nov 28, 2010 at 5:41 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Sun, Nov 28, 2010 at 5:38 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Nov 27, 2010 at 2:44 PM, Jeff Janes jeff.ja...@gmail.com wrote: I haven't thought of a way to test this, so I guess I'll just ask. If the attacking client just waits a few milliseconds for a response and then drops the socket, opening a new one, will the server-side walking-dead process continue to be charged against max_connections until its sleep expires? I'm not sure, either. I suspect the answer is yes. I guess you could test this by writing a loop like this: while true; do psql connection parameters that will fail authentication; done ...and then hitting ^C every few seconds during execution. After doing that for a bit, run select * from pg_stat_activity or ps auxww | grep postgres in another window. Right, I didn't think of using psql, I thought I'd have to wrangle my own socket code. I wrote up a perl script that spawns psql and immediately kills it. I quickly start getting psql: FATAL: sorry, too many clients already errors. And that condition doesn't clear until the sleep expires on the earliest ones spawned. So it looks like the max_connections is charged until the auth_delay expires. Yeah. Avoiding that would be hard, and it's not clear that there's any demand. The demand for doing this much seems a bit marginal too, but there were several people who seemed to think it worth committing, so I did. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Sun, Nov 28, 2010 at 6:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: After our recent conversation about KNNGIST, it occurred to me to wonder whether there's really any point in pretending that a user can usefully add an AM, both due to hard-wired planner knowledge and due to lack of any sort of extensible XLOG support. If not, we could potentially turn pg_am into a hardcoded lookup table rather than a modifiable catalog, which would also likely be faster; and perhaps reference AMs elsewhere with characters rather than OIDs. But even if this were judged a sensible thing to do I'm not very sure that even a purpose-built synthetic benchmark would be able to measure the speedup. Well, the lack of extensible XLOG support is definitely a big handicap to building a *production* index AM as an add-on. But it's not such a handicap for development. Realistically, it's hard for me to imagine that anyone would go to the trouble of building it as a loadable module first and then converting it to part of core later on. That'd hardly be less work. And I don't believe that the planner is hardwired in any way that doesn't allow new index types. GIST and GIN have both been added successfully without kluging the planner. We have 9 boolean flags to indicate the capabilities (or lack thereof) of AMs, and we only have 4 AMs. It seems altogether plausible to assume that the next AM we add could require flags 10 and 11. Heck, I think KNNGIST is going to require another flag... which will likely never be set for any AM other than GIST. It does know a lot more about btree than other index types, but that doesn't mean you can't add a new index type that doesn't behave like btree; that's more reflective of where development effort has been spent. So I would consider the above idea a step backwards, and I doubt it would save anything meaningful anyway. That latter point, as far as I'm concerned, is the real nail in the coffin. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] contrib: auth_delay module
On Sun, Nov 28, 2010 at 3:57 PM, Robert Haas robertmh...@gmail.com wrote: On Sun, Nov 28, 2010 at 5:41 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Sun, Nov 28, 2010 at 5:38 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Nov 27, 2010 at 2:44 PM, Jeff Janes jeff.ja...@gmail.com wrote: I haven't thought of a way to test this, so I guess I'll just ask. If the attacking client just waits a few milliseconds for a response and then drops the socket, opening a new one, will the server-side walking-dead process continue to be charged against max_connections until its sleep expires? I'm not sure, either. I suspect the answer is yes. I guess you could test this by writing a loop like this: while true; do psql connection parameters that will fail authentication; done ...and then hitting ^C every few seconds during execution. After doing that for a bit, run select * from pg_stat_activity or ps auxww | grep postgres in another window. Right, I didn't think of using psql, I thought I'd have to wrangle my own socket code. I wrote up a perl script that spawns psql and immediately kills it. I quickly start getting psql: FATAL: sorry, too many clients already errors. And that condition doesn't clear until the sleep expires on the earliest ones spawned. So it looks like the max_connections is charged until the auth_delay expires. Yeah. Avoiding that would be hard, and it's not clear that there's any demand. The demand for doing this much seems a bit marginal too, but there were several people who seemed to think it worth committing, so I did. Oh, I wasn't complaining. I think that having max_connections be charged for the duration even if the socket is dropped is the only reasonable thing to do, and wanted to verify that it did happen. Otherwise the module wouldn't do a very good job at its purpose, the attacker would simply wait a few milliseconds and then assume it got the wrong password and kill the connection and start a new one. 
Preventing the brute force password attack by shunting it into a DOS attack instead seems reasonable. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] contrib: auth_delay module
On Sun, Nov 28, 2010 at 7:10 PM, Jeff Janes jeff.ja...@gmail.com wrote: Oh, I wasn't complaining. I think that having max_connections be charged for the duration even if the socket is dropped is the only reasonable thing to do, and wanted to verify that it did happen. Otherwise the module wouldn't do a very good job at its purpose, the attacker would simply wait a few milliseconds and then assume it got the wrong password and kill the connection and start new one. Good point. Preventing the brute force password attack by shunting it into a DOS attack instead seems reasonable. OK. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Report: Linux huge pages with Postgres
Kenneth Marshall k...@rice.edu writes: On Sat, Nov 27, 2010 at 02:27:12PM -0500, Tom Lane wrote: ... A bigger problem is that the shmem request size must be a multiple of the system's hugepage size, which is *not* a constant even though the test patch just uses 2MB as the assumed value. For a production-grade patch we'd have to scrounge the active value out of someplace in the /proc filesystem (ick). I would expect that you can just iterate through the size possibilities pretty quickly and just use the first one that works -- no /proc groveling. It's not really that easy, because (at least on the kernel version I tested) it's not the shmget that fails, it's the later shmat. Releasing and reacquiring the shm segment would require significant code restructuring, and at least on some platforms could produce weird failure cases --- I seem to recall having heard of kernels where the release isn't instantaneous, so that you could run up against SHMMAX for no apparent reason. Really you do want to scrape the value. 2. You have to manually allocate some huge pages --- there doesn't seem to be any setting that says just give them out on demand. I did this: sudo sh -c "echo 600 > /proc/sys/vm/nr_hugepages" which gave me a bit over 1GB of space reserved as huge pages. Again, this'd have to be done over again at each system boot. Same. The fact that hugepages have to be manually managed, and that any unaccounted-for pages represent completely wasted RAM, seems like a pretty large PITA to me. I don't see anybody buying into that for gains measured in single-digit percentages. 1GB of shared buffers would not be enough to cause TLB thrashing with most processors. Well, bigger cases would be useful to try, although Simon was claiming that the TLB starts to fall over at 4MB of working set. I don't have a large enough machine to try the sort of test you're suggesting, so if anyone thinks this is worth pursuing, there's the patch ... go test it.
regards, tom lane
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: One possible way to get a real speedup here would be to look for ways to trim the number of catcaches. BTW, it's not going to help to remove catcaches that have a small initial size, as the pg_am cache certainly does. If the bucket zeroing cost is really something to minimize, it's only the caches with the largest nbuckets counts that are worth considering --- and we certainly can't remove those without penalty. regards, tom lane
Re: [HACKERS] [PATCH] Return command tag 'REPLACE X' for CREATE OR REPLACE statements.
On Sun, Nov 28, 2010 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote: Marti Raudsepp ma...@juffo.org writes: This patch returns command tag CREATE X or REPLACE X for LANGUAGE/VIEW/RULE/FUNCTION. This is done by passing completionTag from ProcessUtility to more functions, and adding a 'bool *didUpdate' argument to some lower-level functions. I'm not sure if passing back the status in a bool* is considered good style, but this way all the functions look consistent. This is going to break clients that expect commands to return the same command tag as they have in the past. I doubt that whatever usefulness is gained will outweigh the compatibility problems. You complained about this when we changed the SELECT tag for 9.0 to include row-counts for CTAS etc. where it hadn't before. Have we gotten any complaints about that change breaking clients? I think more expressive command tags are in general a good thing. The idea that this particular change would be useful primarily for humans examining the psql output seems a bit weak to me, but I can easily see it being useful for programs. Right now a program has no reason to look at this command tag anyway; it'll always be the same. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] profiling connection overhead
BTW, this might be premature to mention pending some tests about mapping versus zeroing overhead, but it strikes me that there's more than one way to skin a cat. I still think the idea of statically allocated space sucks. But what if we rearranged things so that palloc0 doesn't consist of palloc-then-memset, but rather pushes the zeroing responsibility down into the allocator? In particular, I'm imagining that palloc0 with a sufficiently large space request --- more than a couple pages --- could somehow arrange to get space that's guaranteed zero already. And if the request isn't large, zeroing it isn't where our problem is anyhow. The most portable way to do that would be to use calloc instead of malloc, and hope that libc is smart enough to provide freshly-mapped space. It would be good to look and see whether glibc actually does so, of course. If not we might end up having to mess with sbrk for ourselves, and I'm not sure how pleasantly that interacts with malloc. Another question that would be worth asking here is whether the hand-baked MemSet macro still outruns memset on modern architectures. I think it's been quite a few years since that was last tested. regards, tom lane
Re: [HACKERS] [PATCH] Return command tag 'REPLACE X' for CREATE OR REPLACE statements.
Robert Haas robertmh...@gmail.com writes: I think more expressive command tags are in general a good thing. The idea that this particular change would be useful primarily for humans examining the psql output seems a bit weak to me, but I can easily see it being useful for programs. Right now a program has no reason to look at this command tag anyway; it'll always be the same. Hmm ... that's a good point, although I'm not sure that it's 100% true. regards, tom lane
Re: [HACKERS] Rethinking representation of sort/hash semantics in queries and plans
I wrote: Now, this loss of flexibility doesn't particularly bother me, because I know of no existing or contemplated btree-substitute access methods. If one did appear on the horizon, there are a couple of ways we could fix the problem, the cleanest being to let a non-btree opfamily declare that it sorts the same as btree opfamily so-and-so. Or we could fix it locally in plancat.c by performing the lookup-the-operators-and- then-the-btree-opfamilies dance on the fly when setting up IndexOptInfo for a non-btree amcanorder index. But I'm disinclined to write such code when there's no way to test it and no foreseeable possibility that it'll ever be used. Maybe we should just make plancat.c throw a not-implemented error if amcanorder is true but it's not btree. On further reflection the last seems like clearly the thing to do. PS: the attached patch doesn't yet include removal of relcache rd_operator arrays, since that would just make the patch bigger without exposing any new issues. And here's the complete patch. regards, tom lane diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c index d8fc12068fb6113c0baf8aa4e5941e0150ce3521..f73e0e6dc6007ce768828480f74621752aaf57d2 100644 *** a/src/backend/optimizer/path/indxpath.c --- b/src/backend/optimizer/path/indxpath.c *** find_usable_indexes(PlannerInfo *root, R *** 380,386 * how many of them are actually useful for this query. This is not * relevant unless we are at top level. */ ! index_is_ordered = OidIsValid(index-fwdsortop[0]); if (index_is_ordered possibly_useful_pathkeys istoplevel outer_rel == NULL) { --- 380,386 * how many of them are actually useful for this query. This is not * relevant unless we are at top level. */ ! 
index_is_ordered = (index-sortopfamily != NULL); if (index_is_ordered possibly_useful_pathkeys istoplevel outer_rel == NULL) { diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c index 8af0c6dc482372244b15a8a5f5a4a562e34ac91b..6325655eed84759168a51aed3f712582ec9c1f00 100644 *** a/src/backend/optimizer/path/pathkeys.c --- b/src/backend/optimizer/path/pathkeys.c *** static PathKey *make_canonical_pathkey(P *** 36,47 EquivalenceClass *eclass, Oid opfamily, int strategy, bool nulls_first); static bool pathkey_is_redundant(PathKey *new_pathkey, List *pathkeys); - static PathKey *make_pathkey_from_sortinfo(PlannerInfo *root, - Expr *expr, Oid ordering_op, - bool nulls_first, - Index sortref, - bool create_it, - bool canonicalize); static Var *find_indexkey_var(PlannerInfo *root, RelOptInfo *rel, AttrNumber varattno); static bool right_merge_direction(PlannerInfo *root, PathKey *pathkey); --- 36,41 *** canonicalize_pathkeys(PlannerInfo *root, *** 224,232 /* * make_pathkey_from_sortinfo ! * Given an expression, a sortop, and a nulls-first flag, create ! * a PathKey. If canonicalize = true, the result is a canonical ! * PathKey, otherwise not. (But note it might be redundant anyway.) * * If the PathKey is being generated from a SortGroupClause, sortref should be * the SortGroupClause's SortGroupRef; otherwise zero. --- 218,226 /* * make_pathkey_from_sortinfo ! * Given an expression and sort-order information, create a PathKey. ! * If canonicalize = true, the result is a canonical PathKey, ! * otherwise not. (But note it might be redundant anyway.) * * If the PathKey is being generated from a SortGroupClause, sortref should be * the SortGroupClause's SortGroupRef; otherwise zero. *** canonicalize_pathkeys(PlannerInfo *root, *** 240,285 */ static PathKey * make_pathkey_from_sortinfo(PlannerInfo *root, ! 
Expr *expr, Oid ordering_op, bool nulls_first, Index sortref, bool create_it, bool canonicalize) { - Oid opfamily, - opcintype; int16 strategy; Oid equality_op; List *opfamilies; EquivalenceClass *eclass; /* ! * An ordering operator fully determines the behavior of its opfamily, so ! * could only meaningfully appear in one family --- or perhaps two if one ! * builds a reverse-sort opfamily, but there's not much point in that ! * anymore. But EquivalenceClasses need to contain opfamily lists based ! * on the family membership of equality operators, which could easily be ! * bigger. So, look up the equality operator that goes with the ordering ! * operator (this should be unique) and get its membership. */ - - /* Find the operator in pg_amop --- failure shouldn't happen */ - if (!get_ordering_op_properties(ordering_op, - opfamily, opcintype, strategy)) - elog(ERROR, operator %u is not a valid ordering operator, -
Re: [HACKERS] Report: Linux huge pages with Postgres
On Mon, Nov 29, 2010 at 12:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: I would expect that you can just iterate through the size possibilities pretty quickly and just use the first one that works -- no /proc groveling. It's not really that easy, because (at least on the kernel version I tested) it's not the shmget that fails, it's the later shmat. Releasing and reacquiring the shm segment would require significant code restructuring, and at least on some platforms could produce weird failure cases --- I seem to recall having heard of kernels where the release isn't instantaneous, so that you could run up against SHMMAX for no apparent reason. Really you do want to scrape the value. Couldn't we just round the shared memory allocation down to a multiple of 4MB? That would handle all older architectures where the size is 2MB or 4MB. I see online that IA64 supports larger page sizes up to 256MB, but then could we make it the user's problem if they change their hugepagesize to a larger value to pick a value of shared_buffers that will fit cleanly? We might need to rejigger things so that the shared memory segment is exactly the size of shared_buffers and any other shared data structures are in a separate segment though for that to work. -- greg
[HACKERS] Feature request: INSERT .... ON DUPLICATE UPDATE ....
Dear all, In our company we use both PostgreSQL and MySQL. Our developers, myself included, think that the INSERT ... ON DUPLICATE UPDATE clause is a much more user-friendly feature, so would you please add a similar feature to PostgreSQL? I know we can write a PROCEDURE or RULE to solve this problem, but compared to writing a PROCEDURE or RULE, it is apparently easier to use an INSERT ... ON DUPLICATE clause. Please consider our request to make PostgreSQL more powerful. Thanks all for the great work on PostgreSQL. Daojing.Zhou 2010/11/29
Re: [HACKERS] contrib: auth_delay module
On Sat, Nov 27, 2010 at 9:25 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Nov 25, 2010 at 1:18 AM, KaiGai Kohei kai...@ak.jp.nec.com wrote: The attached patch is revised version. - Logging part within auth_delay was removed. This module now focuses on injection of a few seconds delay on authentication failed. - Documentation parts were added like any other contrib modules. Committed, with a few adjustments. Per Fujii Masao's suggestion, I changed sleep() to pg_usleep(); I also changed the GUC to be reckoned in milliseconds rather than seconds. Thanks. I found the typo: - diff --git a/contrib/README b/contrib/README index 9e223ef..8a12cc1 100644 --- a/contrib/README +++ b/contrib/README @@ -30,7 +30,7 @@ adminpack - auth_delay Add a short delay after a failed authentication attempt, to make -make brute-force attacks on database passwords a bit harder. + brute-force attacks on database passwords a bit harder. by KaiGai Kohei kai...@ak.jp.nec.com auto_explain - - Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Report: Linux huge pages with Postgres
Greg Stark gsst...@mit.edu writes: On Mon, Nov 29, 2010 at 12:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: Really you do want to scrape the value. Couldn't we just round the shared memory allocation down to a multiple of 4MB? That would handle all older architectures where the size is 2MB or 4MB. Rounding *down* will not work, at least not without extremely invasive changes to the shmem allocation code. Rounding up is okay, as long as you don't mind some possibly-wasted space. I see online that IA64 supports larger page sizes up to 256MB but then could we make it the user's problem if they change their hugepagesize to a larger value to pick a value of shared_buffers that will fit cleanly? We might need to rejigger things so that the shared memory segment is exactly the size of shared_buffers and any other shared data structures are in a separate segment though for that to work. Two shmem segments would be a pretty serious PITA too, certainly a lot more so than a few lines to read a magic number from /proc. But this is all premature pending a demonstration that there's enough potential gain here to be worth taking any trouble at all. The one set of numbers we have says otherwise. regards, tom lane
Re: [HACKERS] profiling connection overhead
On Mon, Nov 29, 2010 at 12:33 AM, Tom Lane t...@sss.pgh.pa.us wrote: The most portable way to do that would be to use calloc instead of malloc, and hope that libc is smart enough to provide freshly-mapped space. It would be good to look and see whether glibc actually does so, of course. If not we might end up having to mess with sbrk for ourselves, and I'm not sure how pleasantly that interacts with malloc. It's *supposed* to interact fine. The only thing I wonder is that I think malloc intentionally uses mmap for larger allocations but I'm not clear what the advantages are. Is it because it's a cheaper way to get zeroed bytes? Or just so that free has a hope of returning the allocations to the OS? Another question that would be worth asking here is whether the hand-baked MemSet macro still outruns memset on modern architectures. I think it's been quite a few years since that was last tested. I know glibc has some sexy memset macros for cases where the size is a constant. I'm not sure there's been much of an advance in the general case though. This would tend to imply we should consider going the other direction of having the caller of palloc0 do the zeroing instead. Or making palloc0 a macro which expands to include calling memset with the parameter inlined. -- greg
Re: [HACKERS] profiling connection overhead
Greg Stark gsst...@mit.edu writes: On Mon, Nov 29, 2010 at 12:33 AM, Tom Lane t...@sss.pgh.pa.us wrote: Another question that would be worth asking here is whether the hand-baked MemSet macro still outruns memset on modern architectures. I think it's been quite a few years since that was last tested. I know glibc has some sexy memset macros for cases where the size is a constant. I'm not sure there's been much of an advance in the general case though. This would tend to imply we should consider going the other direction of having the caller of palloc0 do the zeroing instead. Or making palloc0 a macro which expands to include calling memset with the parameter inlined. Well, that was exactly the reason why we did it the way we do it. However, I think it's probably only node allocations where the size is likely to be constant and hence result in a win. Perhaps we should implement makeNode() differently from the general case. regards, tom lane
Re: [HACKERS] Patch to add a primary key using an existing index
On Fri, Nov 26, 2010 at 05:58, Steve Singer ssin...@ca.afilias.info wrote: The attached version of the patch gets your regression tests to pass. I'm going to mark this as ready for a committer. I think we need more discussions about the syntax: ALTER TABLE table_name ADD PRIMARY KEY (...) WITH (INDEX='index_name') Issues: * WITH (...) is designed for storage parameters. I think treating INDEX as a special keyword in this way might be confusing. * 'index_name' needs to be single-quoted, but object identifiers should be double-quoted literals in normal cases. * The key specifier is a duplicated option because the index has its own keys. Do we need it? It might be for safety, but redundant. Note that the patch raises a reasonable error on conflict: ERROR: PRIMARY KEY/UNIQUE constraint definition does not match the index And, I found a bug: * USING INDEX TABLESPACE clause is silently ignored, even if the index uses another tablespace. After all, do we need a special syntax for the functionality? Reusing the WITH (...) syntax seems to be trouble for me. ADD PRIMARY KEY USING index_name might be a candidate, but we'd better reserve USING for non-btree PRIMARY KEY/UNIQUE indexes. Ideas and suggestions? -- Itagaki Takahiro
Re: [HACKERS] Assertion failure on hot standby
On Fri, 2010-11-26 at 01:11 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: That would mean running GetCurrentTransactionId() inside LockAcquire() if (lockmode >= AccessExclusiveLock && locktag->locktag_type == LOCKTAG_RELATION && !RecoveryInProgress()) (void) GetCurrentTransactionId(); Any objections to that fix? Could we have a wal level test in there too please? It's pretty awful in any case... Slightly neater version of same idea applied to resolve this. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] profiling connection overhead
On Sun, Nov 28, 2010 at 7:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: One possible way to get a real speedup here would be to look for ways to trim the number of catcaches. BTW, it's not going to help to remove catcaches that have a small initial size, as the pg_am cache certainly does. If the bucket zeroing cost is really something to minimize, it's only the caches with the largest nbuckets counts that are worth considering --- and we certainly can't remove those without penalty. Yeah, very true. What's a bit frustrating about the whole thing is that we spend a lot of time pulling data into the caches that's basically static and never likely to change anywhere, ever. I bet the number of people for whom (int4, int4) has any non-standard properties is somewhere between slim and none; and it might well be the case that formrdesc() is faster than reading the relcache init file, if we didn't need to worry about deviation from canonical. This is even more frustrating in the hypothetical situation where a backend can switch databases, because we have to blow away all of these cache entries that are 99.9% likely to be basically identical in the old and new databases. The relation descriptors for pg_class and pg_attribute are examples of things it would be nice to hardwire and never need to update. We are really pretty much screwed if there is any meaningful deviation from what is expected, but relpages, reltuples, and relfrozenxid - and maybe relacl or reloptions - can legitimately vary between databases. Maybe we could speed things up a bit if we got rid of the pg_attribute entries for the system attributes (except OID). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Patch to add a primary key using an existing index
On Sun, Nov 28, 2010 at 8:06 PM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: On Fri, Nov 26, 2010 at 05:58, Steve Singer ssin...@ca.afilias.info wrote: The attached version of the patch gets your regression tests to pass. I'm going to mark this as ready for a committer. I think we need more discussions about the syntax: ALTER TABLE table_name ADD PRIMARY KEY (...) WITH (INDEX='index_name') Why not: ALTER TABLE table_name ADD PRIMARY KEY (...) INDEX index_name; -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [GENERAL] column-level update privs + lock table
2010/11/28 KaiGai Kohei kai...@ak.jp.nec.com: I'm not totally convinced that this is the correct behavior. It seems a bit surprising that UPDATE privilege on a single column is enough to lock out all SELECT activity from the table. It's actually a bit surprising that even full-table UPDATE privileges are enough to do this, but this change would allow people to block access to data they can neither see nor modify. That seems counterintuitive, if not a security hole. In my understanding, UPDATE privilege on a single column also allows lock out concurrent updating even if this query tries to update rows partially. Therefore, the current code considers UPDATE privilege on a single column is enough to lock out the table. Right? Against concurrent updates and deletes, yes. Against inserts that don't involve potential unique-index conflicts, and against selects, no. My comment was from a standpoint which wants consistent behavior between SELECT ... FOR and LOCK command. Again, nothing about this makes those consistent. If we concerned about this behavior, ExecCheckRTEPerms() might be a place where we also should fix. I don't understand what you're getting at here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Feature request: INSERT .... ON DUPLICATE UPDATE ....
On Sun, Nov 28, 2010 at 7:44 PM, Yourfriend doudou...@gmail.com wrote: In our company, we use both PostgreSQL and MySQL, our developers include me think that the INSERT ... ON DUPLICATE UPDATE clause is a much more user friendly function,so, would you please add this liked function in PostgreSQL, I know we can write PROCEDURE or RULE to solve this problem,but compare to write PROCEDURE or RULE, it's apparently more easy to add an INSERT... ON DUPLICATE clause, so, please consider our request to make PostgreSQL more power. I hope very much we'll support something like this some day, but no one is working on it presently. There is, however, ongoing work to support SQL-standard MERGE. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] contrib: auth_delay module
On Sun, Nov 28, 2010 at 7:45 PM, Fujii Masao masao.fu...@gmail.com wrote: Thanks. I found the typo: I only have one? :-) Thanks, fixed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [GENERAL] column-level update privs + lock table
(2010/11/29 10:43), Robert Haas wrote: 2010/11/28 KaiGai Kohei kai...@ak.jp.nec.com: I'm not totally convinced that this is the correct behavior. It seems a bit surprising that UPDATE privilege on a single column is enough to lock out all SELECT activity from the table. It's actually a bit surprising that even full-table UPDATE privileges are enough to do this, but this change would allow people to block access to data they can neither see nor modify. That seems counterintuitive, if not a security hole. In my understanding, UPDATE privilege on a single column also allows lock out concurrent updating even if this query tries to update rows partially. Therefore, the current code considers UPDATE privilege on a single column is enough to lock out the table. Right? Against concurrent updates and deletes, yes. Against inserts that don't involve potential unique-index conflicts, and against selects, no. My comment was from a standpoint which wants consistent behavior between SELECT ... FOR and LOCK command. Again, nothing about this makes those consistent. If we concerned about this behavior, ExecCheckRTEPerms() might be a place where we also should fix. I don't understand what you're getting at here. I thought the author was concerned about the inconsistency between them. (Perhaps I misunderstood his motivation?) What was the purpose that this patch tries to solve? In the first message of this topic, he was concerned as follows: I noticed that granting a user column-level update privileges doesn't allow that user to issue LOCK TABLE with any mode other than Access Share. Do we need to answer: Yes, it is a specification, so you need to grant table-level privileges instead? Thanks -- KaiGai Kohei kai...@ak.jp.nec.com
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: Yeah, very true. What's a bit frustrating about the whole thing is that we spend a lot of time pulling data into the caches that's basically static and never likely to change anywhere, ever. True. I wonder if we could do something like the relcache init file for the catcaches. Maybe we could speed things up a bit if we got rid of the pg_attribute entries for the system attributes (except OID). I used to have high hopes for that idea, but the column privileges patch broke it permanently. regards, tom lane
[HACKERS] On-the-fly index tuple deletion vs. hot_standby
I have a hot_standby system and use it to bear the load of various reporting queries that take 15-60 minutes each. In an effort to avoid long pauses in recovery, I set a vacuum_defer_cleanup_age constituting roughly three hours of the master's transactions. Even so, I kept seeing recovery pause for the duration of a long-running query. In each case, the culprit record was an XLOG_BTREE_DELETE arising from on-the-fly deletion of an index tuple. The attached test script demonstrates the behavior (on HEAD); the index tuple reclamation conflicts with a concurrent SELECT pg_sleep(600) on the standby. Since this inserting transaction aborts, HeapTupleSatisfiesVacuum reports HEAPTUPLE_DEAD independent of vacuum_defer_cleanup_age. We go ahead and remove the index tuples. On the standby, btree_xlog_delete_get_latestRemovedXid does not regard the inserting-transaction outcome, so btree_redo proceeds to conflict with snapshots having visibility over that transaction. Could we correctly improve this by teaching btree_xlog_delete_get_latestRemovedXid to ignore tuples of aborted transactions and tuples inserted and deleted within one transaction? Thanks, nm repro-btree-cleanup.sh Description: Bourne shell script
Re: [HACKERS] pg_execute_from_file review
On Fri, Nov 26, 2010 at 06:24, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Thanks for your review. Please find attached a revised patch where I've changed the internals of the function so that it's split in two and that the opr_sanity check passes, per comments from David Wheeler and Tom Lane. I have some comments and questions about pg_execute_from_file.v5.patch. Source code * OID=3627 is used by another function. Don't you expect 3927? * There is a compiler warning: genfile.c: In function ‘pg_execute_from_file_with_placeholders’: genfile.c:427: warning: unused variable ‘query_string’ * I'd like to ask native speakers whether from is needed in the names of pg_execute_from_file and pg_execute_from_query_string. Design and Implementation * pg_execute_from_file() can execute any files even if they are not in $PGDATA. OTOH, we restrict pg_read_file() to read such files. What will be our policy? Note that the contents of the file might be logged or returned to the client on errors. * Do we have any reasons to have pg_execute_from_file separated from pg_read_file? If we allow pg_read_file() to read files in $PGSHARE, pg_execute_from_file could be replaced with EXECUTE pg_read_file(). (Note that pg_execute_from_file is implemented to do so even now.) * I hope pg_execute_from_file (and pg_read_file) had an option to specify a character encoding of the file. Especially, SJIS is still used widely, but it is not a valid server encoding. -- Itagaki Takahiro
Re: [HACKERS] Patch to add a primary key using an existing index
On Sun, Nov 28, 2010 at 08:40:08PM -0500, Robert Haas wrote: On Sun, Nov 28, 2010 at 8:06 PM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: On Fri, Nov 26, 2010 at 05:58, Steve Singer ssin...@ca.afilias.info wrote: The attached version of the patch gets your regression tests to pass. I'm going to mark this as ready for a committer. I think we need more discussions about the syntax: ALTER TABLE table_name ADD PRIMARY KEY (...) WITH (INDEX='index_name') Why not: ALTER TABLE table_name ADD PRIMARY KEY (...) INDEX index_name; +1 :) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate