Re: [HACKERS] SQL/MED estimated time of arrival?
On Tue, 23 Nov 2010 10:22:02 -0800 Joshua D. Drake j...@commandprompt.com wrote: On Tue, 2010-11-23 at 20:18 +0200, Heikki Linnakangas wrote: On 23.11.2010 14:22, Shigeru HANADA wrote: OID is supported to get the oid from the source table (yes, it works only for postgresql_fdw, but it seems useful to support). I don't think that's worthwhile. Oids on user tables are a legacy feature, not recommended for new applications. Agreed. We should do everything we can to NOT encourage their use. Agreed. I'll remove OID support and repost the patch in a new thread. Regards, -- Shigeru Hanada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: too many KnownAssignedXids
On 24.11.2010 06:56, Joachim Wieland wrote: On Tue, Nov 23, 2010 at 8:45 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 19.11.2010 23:46, Joachim Wieland wrote: FATAL: too many KnownAssignedXids. head: 0, tail: 0, nxids: 9978, pArray->maxKnownAssignedXids: 6890 Hmm, that's a lot of entries in KnownAssignedXids. Can you recompile with WAL_DEBUG, and run the recovery again with wal_debug=on? That will print all the replayed WAL records, which is a lot of data, but it might give a hint what's going on. Sure, but this gives me only one more line: [...] LOG: redo starts at 1F8/FC00E978 LOG: REDO @ 1F8/FC00E978; LSN 1F8/FC00EE90: prev 1F8/FC00E930; xid 385669; len 21; bkpb1: Heap - insert: rel 1663/16384/18373; tid 3829898/23 FATAL: too many KnownAssignedXids CONTEXT: xlog redo insert: rel 1663/16384/18373; tid 3829898/23 LOG: startup process (PID 4587) exited with exit code 1 LOG: terminating any other active server processes Thanks, I can reproduce this now. This happens when you have a wide gap between the oldest still-active xid and the latest xid. When recovery starts, we fetch the oldestActiveXid from the checkpoint record. Let's say that it's 100. We then start replaying WAL records from the Redo pointer, and the first record (the heap insert in your case) contains an Xid that's much larger than 100, say 100000. We call RecordKnownAssignedXids() to make note that all xids in that range are in-progress, but there isn't enough room in the array for that. We normally get away with a smallish array because the array is trimmed at commit and abort records, and at the special xid-assignment record that handles the case of a lot of subtransactions. We initialize the array from the running-xacts record that's written at a checkpoint. That mechanism fails in this case because the heap insert record is seen before the running-xacts record, causing all those xids in the range 100-100000 to be considered in-progress. 
The running-xacts record that comes later would prune them, but we don't have enough slots to hold them until then. Hmm. I'm not sure off the top of my head how to fix that. Perhaps stash the xids we see during WAL replay in private memory instead of putting them in the KnownAssignedXids array until we see the running-xacts record. To reproduce this, I did this in the master:

postgres=# CREATE FUNCTION insertfunc(n integer) RETURNS VOID AS $$
declare
  i integer;
begin
  FOR i IN 1..n LOOP
    BEGIN
      INSERT INTO foo VALUES (1);
    EXCEPTION WHEN division_by_zero THEN
      RAISE NOTICE 'divbyzero';
    END;
  END LOOP;
end;
$$ LANGUAGE plpgsql;
postgres=# SELECT insertfunc(100000);

After letting that run for a while, so that a couple of checkpoints have occurred, kill the master and start the standby to recover from the archive. After it has replayed all the WAL, stop the standby and restart it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
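The failure mode described above can be sketched as a toy model (not the real code: the function name is made up, and the array size is simply taken from the FATAL message earlier in the thread). The KnownAssignedXids array is sized for the normal steady state, but at recovery start the whole gap between oldestActiveXid and the first xid observed in WAL has to fit into it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the overflow: the array capacity is fixed, but the
 * number of xids presumed in-progress at recovery start is the whole
 * gap between oldestActiveXid and the first xid seen in WAL. */
#define MAX_KNOWN_ASSIGNED_XIDS 6890    /* from the FATAL message above */

static bool
record_known_assigned_range(unsigned oldest_active, unsigned observed,
                            unsigned *nxids_out)
{
    unsigned nxids = observed - oldest_active;

    *nxids_out = nxids;
    /* false corresponds to the "too many KnownAssignedXids" FATAL */
    return nxids <= MAX_KNOWN_ASSIGNED_XIDS;
}
```

With oldestActiveXid = 100 and a first observed xid around 100000, the gap of 99900 entries far exceeds the 6890-slot array, while a normal small gap fits easily.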
Re: [HACKERS] Hot Standby: too many KnownAssignedXids
On 24.11.2010 12:48, Heikki Linnakangas wrote: [...] That mechanism fails in this case because the heap insert record is seen before the running-xacts record, causing all those xids in the range 100-100000 to be considered in-progress. 
The running-xacts record that comes later would prune them, but we don't have enough slots to hold them until then. Hmm. I'm not sure off the top of my head how to fix that. Perhaps stash the xids we see during WAL replay in private memory instead of putting them in the KnownAssignedXids array until we see the running-xacts record. Looking closer at RecordKnownAssignedTransactionIds(), there's a related, much more serious bug there too. When latestObservedXid is initialized to the oldest still-running xid, oldestActiveXid, at WAL recovery, we zero the CLOG starting from the oldestActiveXid. That will zap away the clog bits of any old transactions that had already committed before the checkpoint started, but were younger than the oldest still-running transaction. The transactions will be lost :-(. It's dangerous to initialize latestObservedXid to anything older. The idea of keeping the seen xids in a temporary list private to the startup process until the running-xacts record would solve that problem too. ProcArrayInitRecoveryInfo() would not be needed anymore; the KnownAssignedXids tracking would start at the first running-xacts record (or shutdown checkpoint) we see, not any sooner than that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
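The clog hazard described here can be shown with a deliberately tiny model (illustrative only; the real code works on clog pages and two-bit status values, not a byte array): zeroing the clog from oldestActiveXid forward wipes the commit bit of any transaction that committed while an older transaction was still running.

```c
#include <assert.h>
#include <string.h>

/* Tiny model of the hazard: zeroing clog from oldestActiveXid destroys
 * the commit status of younger, already-committed transactions. */
enum { XID_SPACE = 1000 };
enum { CLOG_UNKNOWN = 0, CLOG_COMMITTED = 1 };

static unsigned char clog[XID_SPACE];

static void
set_committed(unsigned xid)
{
    clog[xid] = CLOG_COMMITTED;
}

static void
zero_clog_from(unsigned start_xid)
{
    /* what initializing latestObservedXid to oldestActiveXid causes */
    memset(clog + start_xid, CLOG_UNKNOWN, XID_SPACE - start_xid);
}
```

Here xid 150 committed before the checkpoint but is younger than the still-running xid 100; zeroing from 100 loses its commit bit, while xid 50 (older than oldestActiveXid) survives.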
Re: [HACKERS] Hot Standby: too many KnownAssignedXids
On 24.11.2010 13:38, Heikki Linnakangas wrote: It's dangerous to initialize latestObservedXid to anything older. Older than the nextXid-1 from the checkpoint record, I meant to say. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Instrument checkpoint sync calls
On Wed, Nov 24, 2010 at 1:02 AM, Greg Smith g...@2ndquadrant.com wrote: Robert Haas wrote: Did this get eaten by the email goblin, or you're still working on it? Fell behind due to an unfortunately timed bit of pneumonia. Hurray for the health benefits of cross country flights. Will fix this up, rebase my other patch, and head toward some more review/'Fest cleanup now that I'm feeling better. Ouch. Fringe benefits of consulting. Thanks for the update. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 2:10 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Anything we can do about this? That's a lot of overhead, and it'd be a lot worse on a big machine with 8GB of shared_buffers. Micro-optimizing that search for the non-zero value helps a little bit (attached). Reduces the percentage shown by oprofile from about 16% to 12% on my laptop. For bigger gains, [...] The first optimization that occurred to me was to remove the loop altogether. I could maybe see needing to do something like this if we're recovering from an error, but why do we need to do this (except perhaps to fail an assertion) if we're exiting cleanly? Even a session-lifetime buffer-pin leak would be quite disastrous, one would think. Now, the other question is whether this really matters. Even if we eliminate that loop in AtProcExit_Buffers altogether, is connect/disconnect still so slow that you have to use a connection pooler if you do that a lot? Oh, I'm sure this isn't going to be nearly enough to fix that problem, but every little bit helps; and if we never do the first optimization, we'll never get to #30 or wherever it is that it really starts to move the needle. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
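For reference, the kind of micro-optimization being discussed looks roughly like this (a sketch, not Heikki's actual patch; the function and array names are invented): scan the per-backend pin-count array two entries at a time through a 64-bit load, falling back to a scalar scan only once a non-zero word is found.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Sketch: find the first leaked buffer pin by scanning two int32
 * refcounts at a time as one 64-bit word. */
static int
first_nonzero_refcount(const int32_t *refcounts, size_t n)
{
    size_t i = 0;

    /* fast path: two zero entries make one zero 64-bit word */
    for (; i + 1 < n; i += 2)
    {
        uint64_t pair;

        memcpy(&pair, &refcounts[i], sizeof(pair));
        if (pair != 0)
            break;
    }
    /* slow path: pinpoint the entry (also handles an odd-length tail) */
    for (; i < n; i++)
        if (refcounts[i] != 0)
            return (int) i;
    return -1;                  /* no leaked pins */
}
```

The memcpy compiles to a plain aligned load on mainstream compilers, so the fast path halves the number of compare-and-branch iterations over a mostly-zero array.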
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
On Wed, Nov 24, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: Josh Berkus j...@agliodbs.com writes: Well, very little about pg_dump is very [E], IMNSHO. The question in my mind here is what format the list file will take I was thinking same format as pg_restore -l, only without the dumpIDs. Nope ... those strings are just helpful comments, they aren't really guaranteed to be unique identifiers. In any case, it seems unlikely that a user could expect to get the more complicated cases exactly right other than by consulting pg_dump | pg_restore -l output. Which makes the use-case kind of dubious to me. I don't say that this wouldn't be a useful feature, but you need a better spec than this. One thing I've often wished for is the ability to dump a specific function (usually right after I accidentally rm the file the source code was in). pg_dump has -t to pick a table, but there's no analogous way to select an object that isn't a relation. I think the first step here would be to design a system that lets you use a command-line argument to dump an arbitrary object, and after that you could work on reading the object descriptors from a file rather than the command line. As a first attempt at syntax, I might suggest something along the lines of object type: object name, where the types and names might look to COMMENT ON for inspiration. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] final patch - plpgsql: for-in-array
On Wed, Nov 24, 2010 at 1:06 AM, Tom Lane t...@sss.pgh.pa.us wrote: Cédric Villemain cedric.villemain.deb...@gmail.com writes: I think you (Robert) misunderstood dramatically what Pavel is trying to do. Pavel did excellent optimization work on a specific point. This specific point looks crucial to me in the current behavior of PostgreSQL[1]. AFAIS Pavel didn't want to implement a genius syntax, but an optimization feature. As near as I can tell, Pavel is bullheadedly insisting on adding new syntax, not on the optimization aspect of it. I already pointed out how he could get 100% of the performance benefit using the existing syntax, but he doesn't appear to be willing to pursue that route. Right, that was my impression, too. But, I think this may be partly a case of people talking past each other. My impression of this conversation was a repetition of this sequence: A: This syntax is bad. B: But it's way faster! ...which makes no sense. However, what I now think is going on here is that there are really two separate things that are wished for here - a more compact syntax, and a performance improvement. And taken separately, I agree with both of those desires. PL/pgsql is an incredibly clunky language syntactically, and it's also slow. A patch that improves either one of those things has value, whether or not it also does the other one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
Hey hackers, Completely agree with Robert! It would be nice to dump function definitions, e.g. to make it possible to keep them in git separately. I also want to propose making it possible to dump function definitions as CREATE OR REPLACE FUNCTION rather than just CREATE FUNCTION (as pg_dump dumps them now). It would be useful, just like psql's \ef. 2010/11/24 Robert Haas robertmh...@gmail.com [...] -- // Dmitriy.
Re: [HACKERS] Extensions, this time with a patch
On Tue, Nov 23, 2010 at 18:19, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Please find that done in attached v4 of the cfparser patch. RECOVERY_COMMAND_FILE is opened twice in the patch. The first time is for checking its existence, and the second time is for parsing. Instead of the repeat, how about adding a FILE* version of the parser? It will also be called from ParseConfigFile() as a subroutine. bool ParseConfigFd(FILE *fd, const char *config_file, int depth, ...) BTW, the parser supports include and custom_variable_classes not only for postgresql.conf but also for all files. Is that the intended behavior? I think they are harmless, so we don't have to change the code; include might be useful even in recovery.conf, and custom_variable_classes will just produce an "unrecognized recovery parameter" error after all. -- Itagaki Takahiro
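Itagaki's suggestion, reduced to its shape (this is a simplified stand-in, not the real cfparser code, which handles quoting, comments, and includes): a FILE*-based entry point lets the caller open the file once for both the existence check and the parse, with ParseConfigFile() becoming a thin wrapper that fopen()s the file and delegates.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

/* Simplified FILE*-based parser: look up one key in a stream of
 * key=value lines, leaving file opening/closing to the caller. */
static bool
parse_config_fd(FILE *fp, const char *key, char *value, size_t valsize)
{
    char line[256];
    size_t keylen = strlen(key);

    while (fgets(line, sizeof(line), fp) != NULL)
    {
        if (strncmp(line, key, keylen) == 0 && line[keylen] == '=')
        {
            char *v = line + keylen + 1;
            size_t len = strcspn(v, "\n");  /* strip the newline */

            if (len >= valsize)
                len = valsize - 1;
            memcpy(value, v, len);
            value[len] = '\0';
            return true;
        }
    }
    return false;
}
```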
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
On 11/24/2010 07:29 AM, Robert Haas wrote: As a first attempt at syntax, I might suggest something along the lines of object type: object name, where the types and names might look to COMMENT ON for inspiration. pg_dump already uses a list of object types (e.g. as seen in the output from pg_restore --list). Let's not invent something new if we don't need to. But we'll need more than that. We'll need enough for disambiguation, especially for functions, which is your stated use case. So, something like this might work: FUNCTION:myschema.myfunc:integer,text,timestamp with time zone We could allow the object type to compare case-insensitively. For extra credit, allow type aliases, and don't require argument types if no disambiguation is needed. cheers andrew
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
On Wed, Nov 24, 2010 at 8:41 AM, Andrew Dunstan and...@dunslane.net wrote: On 11/24/2010 07:29 AM, Robert Haas wrote: As a first attempt at syntax, I might suggest something along the lines of object type: object name, where the types and names might look to COMMENT ON for inspiration. pg_dump already uses a list of object types (e.g. as seen in the output from pg_restore --list). Let's not invent something new if we don't need to. But we'll need more than that. We'll need enough for disambiguation, especially for functions which is your stated use case. So, something like this might work: FUNCTION:myschema.myfunc:integer,text,timestamp with time zone Actually, that's pretty much exactly what I was proposing, except that I would've kept the existing convention for how to write a function's arguments: FUNCTION:myschema.myfunc(integer,text,timestamp with time zone) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] requested feature: limit for text or bytea parameters in log
Hello, We use very rich logging, for pgFouine processing. But sometimes 1MB parameters get logged, which is absolutely useless. These long values could be replaced by the first n chars, e.g.: ... truncated (original length: 223636, md5: jhagjkafhskdjfhdsf) Regards, Pavel Stehule
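What Pavel is asking for might look like this in outline (a sketch only; the function name and exact annotation format are invented, and the md5 of the full value is left out here to keep the example self-contained):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Keep the first maxlen characters of an oversized logged parameter
 * and record the original length instead of logging the whole value. */
static void
truncate_for_log(const char *value, size_t maxlen, char *out, size_t outsize)
{
    size_t len = strlen(value);

    if (len <= maxlen)
        snprintf(out, outsize, "%s", value);
    else
        snprintf(out, outsize, "%.*s ... truncated (original length: %zu)",
                 (int) maxlen, value, len);
}
```

A real patch would presumably hang this off a GUC (something like a hypothetical log_parameter_max_length) so that pgFouine-style setups can cap the log volume without losing the fact that a large value was bound.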
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
On Wed, Nov 24, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: Nope ... those strings are just helpful comments, they aren't really guaranteed to be unique identifiers. In any case, it seems unlikely that a user could expect to get the more complicated cases exactly right other than by consulting pg_dump | pg_restore -l output. Which makes the use-case kind of dubious to me. In which case would the catalogId, i.e. (tableoid, oid) not be unique? Or do you rather mean that it does not necessarily refer to the same object if that object got somehow recreated, or that it could be different on different installations of the same database? Thanks, Joachim
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
On 11/24/2010 09:05 AM, Joachim Wieland wrote: On Wed, Nov 24, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: Nope ... those strings are just helpful comments, they aren't really guaranteed to be unique identifiers. In any case, it seems unlikely that a user could expect to get the more complicated cases exactly right other than by consulting pg_dump | pg_restore -l output. Which makes the use-case kind of dubious to me. In which case would the catalogId, i.e. (tableoid, oid) not be unique? Or do you rather mean that it does not necessarily refer to the same object if that object got somehow recreated or that it could be different on different installations of the same database? It would be unique, but a pain in the neck for users to get. Robert's idea will have more traction with users. cheers andrew
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
Joachim Wieland j...@mcknight.de writes: On Wed, Nov 24, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: Nope ... those strings are just helpful comments, they aren't really guaranteed to be unique identifiers. In any case, it seems unlikely that a user could expect to get the more complicated cases exactly right other than by consulting pg_dump | pg_restore -l output. Which makes the use-case kind of dubious to me. In which case would the catalogId, i.e. (tableoid, oid) not be unique? Catalog OID + object OID would be unique, but surely we don't want to make users deal in specifying the objects to be dumped with that. Actually, what occurs to me to wonder is whether the facility has to be guaranteed unique at all. If for instance you have a group of overloaded functions, is there really a big use-case for dumping just one and not the whole group? Even if you think there's some use for it, is it big enough to justify a quantum jump in the complexity of the feature? Here's a radically simplified proposal: provide a switch --object-name=pattern where pattern follows the same rules as in psql \d commands (just to use something users will already know). Dump every object, of any type, whose qualified name matches the pattern, ie the same objects that would be shown by \d (of the relevant type) using the pattern. Accept multiple occurrences of the switch and dump the union of the matched objects. (Now that I think about it, this is the same as the existing --table switch, just generalized to match any object type.) There would be some cases where this'd dump more than you really want, but I think it'd make up for that in ease-of-use. It's not clear to me that dumping a few extra objects is a big problem except for the case where the objects are large tables, and in that case if you aren't specifying a sufficiently exact pattern, it's your own fault not a limitation of the feature. BTW, what about dependencies? 
One of the main complaints we've heard about pg_restore's filtering features is that they are not smart about including things like the indexes of a selected table, or the objects it depends on (eg, functions referred to in CHECK constraints). I'm not sure that a pure name-based filter will be any more usable than pg_restore's filter, if there is no accounting for dependencies. The risk of not including dependencies at dump time is vastly higher than in pg_restore, too, since by the time you realize you omitted something critical it may be too late to go back and get another dump. regards, tom lane
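The psql-style matching Tom proposes reusing can be reduced to a small recursive matcher (simplified; the real psql rules also handle double-quoting, regex metacharacters, and schema qualification): '*' matches any run of characters, '?' any single character, and unquoted comparison is case-insensitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <ctype.h>

/* Simplified psql \d-style pattern match: '*' = any run of characters,
 * '?' = any one character, letters compared case-insensitively. */
static bool
pattern_match(const char *pat, const char *name)
{
    if (*pat == '\0')
        return *name == '\0';
    if (*pat == '*')
        return pattern_match(pat + 1, name) ||          /* '*' matches "" */
               (*name != '\0' && pattern_match(pat, name + 1));
    if (*name != '\0' &&
        (*pat == '?' || tolower((unsigned char) *pat) ==
                        tolower((unsigned char) *name)))
        return pattern_match(pat + 1, name + 1);
    return false;
}
```

A --object-name switch built on these rules would behave exactly like the \d patterns users already know, which is the ease-of-use argument above.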
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
On Wed, Nov 24, 2010 at 9:52 AM, Tom Lane t...@sss.pgh.pa.us wrote: Actually, what occurs to me to wonder is whether the facility has to be guaranteed unique at all. If for instance you have a group of overloaded functions, is there really a big use-case for dumping just one and not the whole group? Even if you think there's some use for it, is it big enough to justify a quantum jump in the complexity of the feature? Well, I think that being able to dump one specific function is a pretty important use case. But I don't see why that's necessarily irreconcilable with your suggested syntax of --function=pattern, if we assume that the pattern is being matched against pg_proc.oid::regprocedure and define the matching rules such that foo(text) will not match sfoo(text). Nothing anyone has proposed sounds like a quantum jump in complexity over your proposal. BTW, what about dependencies? One of the main complaints we've heard about pg_restore's filtering features is that they are not smart about including things like the indexes of a selected table, or the objects it depends on (eg, functions referred to in CHECK constraints). I'm not sure that a pure name-based filter will be any more usable than pg_restore's filter, if there is no accounting for dependencies. I am 100% positive that it will be a big step forward. I think the dependency issue is vaguely interesting, but less important and orthogonal. This will be very useful for cherry-picking an object from one server or database and loading it into another, a need that comes up with some frequency. Sure, it'd be handy to be able to cherry-pick the dependencies automatically, but you don't always need or even want that - you may know that they are already present in the target DB. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Hot Standby: too many KnownAssignedXids
On Wed, 2010-11-24 at 12:48 +0200, Heikki Linnakangas wrote: When recovery starts, we fetch the oldestActiveXid from the checkpoint record. Let's say that it's 100. We then start replaying WAL records from the Redo pointer, and the first record (heap insert in your case) contains an Xid that's much larger than 100. We call RecordKnownAssignedXids() to make note that all xids between that range are in-progress, but there isn't enough room in the array for that. Agreed. Hmm. I'm not sure off the top of my head how to fix that. Perhaps stash the xids we see during WAL replay in private memory instead of putting them in the KnownAssignedXids array until we see the running-xacts record. Moving LogStandbySnapshot() earlier will help but won't solve it fully. Will think. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 2:10 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Micro-optimizing that search for the non-zero value helps a little bit (attached). Reduces the percentage shown by oprofile from about 16% to 12% on my laptop. That micro-optimization looks to me like your compiler leaves something to be desired. The first optimization that occurred to me was to remove the loop altogether. Or make it execute only in assert-enabled mode, perhaps. This check had some use back in the bad old days, but the ResourceOwner mechanism has probably removed a lot of the argument for it. The counter-argument might be that failing to remove a buffer pin would be disastrous; but I can't see that it'd be worse than failing to remove an LWLock, and we have no belt-and-suspenders-too loop for those. regards, tom lane
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
On Wed, Nov 24, 2010 at 9:38 AM, Andrew Dunstan and...@dunslane.net wrote: It would be unique, but a pain in the neck for users to get. Robert's idea will have more traction with users. Whatever approach we use, we need to think about the use case where 1% of the objects should be dumped, but we should also make sure that you can more or less easily dump 99% of the objects. Robert's use case is the 1% use case. Especially for the 99% case, however, pg_dump needs to create a full list of all available objects, in whatever format, as a starting point, so that you could just save and edit this list and delete what you don't want, instead of writing such a list from scratch. Joachim
Re: [HACKERS] final patch - plpgsql: for-in-array
Robert Haas robertmh...@gmail.com writes: Right, that was my impression, too. But, I think this may be partly a case of people talking past each other. My impression of this conversation was a repetition of this sequence: A: This syntax is bad. B: But it's way faster! ...which makes no sense. However, what I now think is going on here is that there are really two separate things that are wished for here - a more compact syntax, and a performance improvement. And taken separately, I agree with both of those desires. PL/pgsql is an incredibly clunky language syntactically, and it's also slow. A patch that improves either one of those things has value, whether or not it also does the other one. I understand the desire for nicer syntax, in the abstract. I'm just unimpressed by this particular change, mainly because I'm afraid that it will make syntax-error behaviors worse and foreclose future options for other changes to FOR. If it were necessary to change the syntax to get the performance benefit, I might think that on balance we should do so; but it isn't. regards, tom lane
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
Dmitriy Igrishin dmit...@gmail.com writes: I also want to propose to make it possible dump function definitions as CREATE OR REPLACE FUNCTION rather than just CREATE FUNCTION (as pg_dump dumps them now). It's intentional that pg_dump doesn't do that. Please don't think that pg_dump is a substitute for \ef. regards, tom lane
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 9:52 AM, Tom Lane t...@sss.pgh.pa.us wrote: Actually, what occurs to me to wonder is whether the facility has to be guaranteed unique at all. If for instance you have a group of overloaded functions, is there really a big use-case for dumping just one and not the whole group? Even if you think there's some use for it, is it big enough to justify a quantum jump in the complexity of the feature? Well, I think that being able to dump one specific function is a pretty important use case. But I don't see why that's necessarily irreconcilable with your suggested syntax of --function=pattern, if we assume that the pattern is being matched against pg_proc.oid::regprocedure and define the matching rules such that foo(text) will not match sfoo(text). Nothing anyone has proposed sounds like a quantum jump in complexity over your proposal. It *will* be manifestly harder to use if users have to spell the argument types just so. Consider int4 versus integer, varchar versus character varying (and not character varying(N)), etc etc. I think that leaving out the argument types is something we should very strongly consider here. BTW, what about dependencies? One of the main complaints we've heard about pg_restore's filtering features is that they are not smart about including things like the indexes of a selected table, or the objects it depends on (eg, functions referred to in CHECK constraints). I'm not sure that a pure name-based filter will be any more usable than pg_restore's filter, if there is no accounting for dependencies. I am 100% positive that it will be a big step forward. Apparently you haven't been reading pgsql-bugs and pgsql-novice for the last five or ten years. These are large problems in practice. regards, tom lane
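Tom's int4-versus-integer point is easy to see in code: any matcher that insists on exact argument-type spellings would need an alias table along these lines (a small illustrative excerpt, nowhere near complete) just to make the obvious alternative spellings compare equal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map common PostgreSQL type-name aliases to the canonical spelling
 * used by pg_proc.oid::regprocedure (illustrative subset only). */
static const char *
normalize_type_name(const char *name)
{
    static const char *aliases[][2] = {
        {"int4",   "integer"},
        {"int8",   "bigint"},
        {"int2",   "smallint"},
        {"float8", "double precision"},
        {"bool",   "boolean"},
        {"varchar", "character varying"},
    };
    size_t i;

    for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++)
        if (strcmp(name, aliases[i][0]) == 0)
            return aliases[i][1];
    return name;                /* already canonical, or unknown */
}
```

Even this ignores typmods like character varying(N), which is why making the argument types optional, as both sides suggest, sidesteps the whole problem for the common unambiguous case.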
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
Joachim Wieland j...@mcknight.de writes: Whatever approach we use, we need to think about the use case where 1% of the objects should be dumped, but we should also make sure that you can more or less easily dump 99% of the objects. Robert's use case is the 1% use case. Especially for the 99% case, however, pg_dump needs to create a full list of all available objects, in whatever format, as a proposal, so that you could just save and edit this list and delete what you don't want, instead of writing such a list from scratch. For that I'd suggest an --exclude=pattern switch. I'm really not happy with the idea of applying pg_restore's -l then -L workflow to dumps from a live database. It's workable for pg_restore because the dump file doesn't change underneath you between the two runs. But having to make a list for pg_dump seems like a foot-gun. Imagine a DBA who wants to exclude a large log table from his dumps, so he makes a -l-like list and removes that table, sets up the cron job to use that list, and forgets about it. Months later, he finds out that his backups don't contain $critical-object-added-later. A static external list of objects to be dumped just doesn't make sense to me. regards, tom lane
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
On Wed, Nov 24, 2010 at 10:45 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 9:52 AM, Tom Lane t...@sss.pgh.pa.us wrote: Actually, what occurs to me to wonder is whether the facility has to be guaranteed unique at all. If for instance you have a group of overloaded functions, is there really a big use-case for dumping just one and not the whole group? Even if you think there's some use for it, is it big enough to justify a quantum jump in the complexity of the feature? Well, I think that being able to dump one specific function is a pretty important use case. But I don't see why that's necessarily irreconcilable with your suggested syntax of --function=pattern, if we assume that the pattern is being matched against pg_proc.oid::regprocedure and define the matching rules such that foo(text) will not match sfoo(text). Nothing anyone has proposed sounds like a quantum jump in complexity over your proposal. It *will* be manifestly harder to use if users have to spell the argument types just so. Consider int4 versus integer, varchar versus character varying (and not character varying(N)), etc etc. I think that leaving out the argument types is something we should very strongly consider here. I don't see why this is an either/or question. Can't we make them optional? BTW, what about dependencies? One of the main complaints we've heard about pg_restore's filtering features is that they are not smart about including things like the indexes of a selected table, or the objects it depends on (eg, functions referred to in CHECK constraints). I'm not sure that a pure name-based filter will be any more usable than pg_restore's filter, if there is no accounting for dependencies. I am 100% positive that it will be a big step forward. Apparently you haven't been reading pgsql-bugs and pgsql-novice for the last five or ten years. These are large problems in practice. 
That seems like a cheap shot, since you already know that I haven't been reading any of the mailing lists for that long. I have, however, been using PostgreSQL for that long, and I've hit this problem myself. I don't say that being able to dump an exact object and nothing more will solve every use case, but I do say it's useful, at least to me. I've wanted it many times. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The first optimization that occurred to me was remove the loop altogether. Or make it execute only in assert-enabled mode, perhaps. This check had some use back in the bad old days, but the ResourceOwner mechanism has probably removed a lot of the argument for it. Yeah, that's what I was thinking - this code would have been a good backstop when our cleanup mechanisms were not as robust as they seem to be today. But making the check execute only in assert-enabled mode doesn't seem right, since the check actually acts to mask other coding errors, rather than reveal them. Maybe we could replace the check with one that only occurs in an Assert-enabled build and just loops through and does Assert(PrivateRefCount[i] == 0). I'm not sure exactly where this gets called in the shutdown sequence, though - is it sensible to Assert() here? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote: Or make it execute only in assert-enabled mode, perhaps. But making the check execute only in assert-enabled mode doesn't seem right, since the check actually acts to mask other coding errors, rather than reveal them. Maybe we could replace the check with one that only occurs in an Assert-enabled build and just loops through and does Assert(PrivateRefCount[i] == 0). Yeah, that would be sensible. There is precedent for this elsewhere too; I think there's a similar setup for checking buffer refcounts during transaction cleanup. I'm not sure exactly where this gets called in the shutdown sequence, though - is it sensible to Assert() here? Assert is sensible anywhere. regards, tom lane
Re: [HACKERS] Suggested easy TODO: pg_dump --from-list
Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 10:45 AM, Tom Lane t...@sss.pgh.pa.us wrote: It *will* be manifestly harder to use if users have to spell the argument types just so. Consider int4 versus integer, varchar versus character varying (and not character varying(N)), etc etc. I think that leaving out the argument types is something we should very strongly consider here. I don't see why this is an either/or question. Can't we make them optional? That might work. regards, tom lane
Re: [HACKERS] final patch - plpgsql: for-in-array
Hello 2010/11/24 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: Right, that was my impression, too. But, I think this may be partly a case of people talking past each other. My impression of this conversation was a repetition of this sequence: A: This syntax is bad. B: But it's way faster! ...which makes no sense. However, what I now think is going on here is that there are really two separate things that are wished for here - a more compact syntax, and a performance improvement. And taken separately, I agree with both of those desires. PL/pgsql is an incredibly clunky language syntactically, and it's also slow. A patch that improves either one of those things has value, whether or not it also does the other one. I understand the desire for nicer syntax, in the abstract. I'm just unimpressed by this particular change, mainly because I'm afraid that it will make syntax-error behaviors worse and foreclose future options for other changes to FOR. If it were necessary to change the syntax to get the performance benefit, I might think that on balance we should do so; but it isn't. I am in favor of any readable syntax; it need not be FOR-IN-ARRAY. I understand the problem with syntax-error checking, but I am not sure that some different loop with a control variable would be any less ugly in the language. Can't we rewrite the parsing of the FOR clause to be more robust? I agree with you that there may be other requests in the future; and if I remember well, there were only a few changes in other statements (at the parser level) but significant changes in FOR. A hypothetical statement would probably be (my opinion): FOR var [, vars] UNKNOWN SYNTAX LOOP ... END LOOP; PL/SQL uses an enhanced FOR: FOR var IN collection.first .. collection.last LOOP END LOOP; From my point of view, introducing a new keyword carries a higher risk, so I would not want to do that.
Regards Pavel Stehule regards, tom lane
Re: [HACKERS] [JDBC] JDBC and Binary protocol error, for some statements
Result is oid=23, format=(0) T, value = 0x00,0x00,0x00,0x02 What do you mean regarding the format? Are you just inferring that from the data? If memory serves, the format of a particular column is not specified anywhere other than the RowDescription, and according to your JDBC log output above, the server is telling you the format is text (1) (which is your point--it doesn't match the resulting data--but I want to make sure we're clear on what's actually going on). Also, can you narrow this down to a simple, self-contained test case (with code)? Even if it's against a custom driver build, that would be easier to investigate. --- Maciek Sakrejda | System Architect | Truviso 1065 E. Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 11:33 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote: Or make it execute only in assert-enabled mode, perhaps. But making the check execute only in assert-enabled mode doesn't seem right, since the check actually acts to mask other coding errors, rather than reveal them. Maybe we could replace the check with one that only occurs in an Assert-enabled build and just loops through and does Assert(PrivateRefCount[i] == 0). Yeah, that would be sensible. There is precedent for this elsewhere too; I think there's a similar setup for checking buffer refcounts during transaction cleanup. I'm not sure exactly where this gets called in the shutdown sequence, though - is it sensible to Assert() here? Assert is sensible anywhere. OK, patch attached. Here's what oprofile output looks like with this applied:

 3505  10.4396  libc-2.11.2.so  memset
 2051   6.1089  libc-2.11.2.so  memcpy
 1686   5.0217  postgres        AllocSetAlloc
 1642   4.8907  postgres        hash_search_with_hash_value
 1247   3.7142  libc-2.11.2.so  _int_malloc
 1096   3.2644  libc-2.11.2.so  fread
  855   2.5466  ld-2.11.2.so    do_lookup_x
  723   2.1535  ld-2.11.2.so    _dl_fixup
  645   1.9211  ld-2.11.2.so    strcmp
  620   1.8467  postgres        MemoryContextAllocZero

Somehow I don't think I'm going to get much further with this without figuring out how to get oprofile to cough up a call graph. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company AtProcExit_Buffers.patch Description: Binary data
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: OK, patch attached. Two comments: 1. A comment would help, something like "Assert we released all buffer pins". 2. AtProcExit_LocalBuffers should be redone the same way, for consistency (it likely won't make any performance difference). Note the comment for AtProcExit_LocalBuffers, too; that probably needs to be changed along the lines of "If we missed any, and assertions aren't enabled, we'll fail later in DropRelFileNodeBuffers while trying to drop the temp rels." regards, tom lane
Re: [HACKERS] final patch - plpgsql: for-in-array
On Wed, Nov 24, 2010 at 10:33 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Right, that was my impression, too. But, I think this may be partly a case of people talking past each other. My impression of this conversation was a repetition of this sequence: A: This syntax is bad. B: But it's way faster! ...which makes no sense. However, what I now think is going on here is that there are really two separate things that are wished for here - a more compact syntax, and a performance improvement. And taken separately, I agree with both of those desires. PL/pgsql is an incredibly clunky language syntactically, and it's also slow. A patch that improves either one of those things has value, whether or not it also does the other one. I understand the desire for nicer syntax, in the abstract. I'm just unimpressed by this particular change, mainly because I'm afraid that it will make syntax-error behaviors worse and foreclose future options for other changes to FOR. If it were necessary to change the syntax to get the performance benefit, I might think that on balance we should do so; but it isn't. It'd be nicer syntax if there were some way to have the keyword not adjacent to the arbitrary expression. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 8.4-vintage problem in postmaster.c
On 11/15/2010 03:24 PM, Alvaro Herrera wrote: Excerpts from Tom Lane's message of sáb nov 13 19:07:50 -0300 2010: Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes: On 11/13/2010 06:58 PM, Tom Lane wrote: Just looking at it, I think that the logic in canAcceptConnections got broken by somebody in 8.4, and then broken some more in 9.0: in some cases it will return an okay-to-proceed status without having checked for TOOMANY children. Was this system possibly in PM_WAIT_BACKUP or PM_HOT_STANDBY state? What version was actually running? I don't have too many details on the actual setup (working on that), but the box in question is running 8.4.2 and had no issues before the upgrade to 8.4 (ie 8.3 was reported to work fine - so an 8.4+ breakage looks plausible). Well, this failure would certainly involve a flood of connection attempts, so it's possible it's a pre-existing bug that they just did not happen to trip over before. But the sequence of events that I'm thinking about is a smart shutdown attempt (SIGTERM to postmaster) while an online backup is in progress, followed by a flood of near-simultaneous connection attempts while the backup is still active. As far as I could gather from Stefan's description, I think this is pretty unlikely. It seems to me that the "too many children" error message is very common in the 8.3 setup already, and the only reason they have a problem on 8.4 is that it crashes instead. not sure if that is true - but 8.4 crashes whereas 8.3 just (seems to) work - the issue is still there with 8_4_STABLE... DEBUG3 level output (last few hours - 7MB in size) is available under http://www.kaltenbrunner.cc/files/postgresql-2010-11-24_143513.log From looking at the code I'm not immediately seeing what is going wrong here, but maybe somebody else has an idea. Stefan
Re: [HACKERS] [JDBC] JDBC and Binary protocol error, for some statements
I didn't describe the log correctly. The first attached response is a normal execution (flags QUERY_SUPPRESS_BEGIN | QUERY_ONESHOT); the second is a compiled statement (QUERY_SUPPRESS_BEGIN only). Text format is marked as 0, binary format is 1. The first shown execution (flags=17) is good: it says the result is sent in binary format, as int4. But the second one (flags=16, statement compiled) is bad, because the server described the row as text: 07:52:06.061 (54) =BE RowDescription(1) 07:52:06.061 (54) Field(,INT4,4,T) but the received tuple was clearly in binary format, 0x00, 0x00, 0x00, 0x02. (Look at the length: it should be 1 or 2 if it's text format and the value is 2.) Put simply, the server said "you will receive text data", but sent it as a binary-encoded int. I've checked this against 8.4 and 9.0.1. Maciek Sakrejda msakre...@truviso.com Wednesday 24 November 2010 18:02:27 Result is oid=23, format=(0) T, value = 0x00,0x00,0x00,0x02 What do you mean regarding the format? Are you just inferring that from the data? If memory serves, the format of a particular column is not specified anywhere other than the RowDescription, and according to your JDBC log output above, the server is telling you the format is text (1) (which is your point--it doesn't match the resulting data--but I want to make sure we're clear on what's actually going on). It's jdbc2.PreparedStatementTest, from the JDBC driver unit tests. I've exposed the sources here: http://www.rsmogura.net/pgsql/pgjdbc_exp_20101124.tar.gz ; the compiled driver and unit tests are in the parent directory. All tests not related to this bug have been commented out; just run ant test. Also, can you narrow this down to a simple, self-contained test case (with code)? Even if it's against a custom driver build, that would be easier to investigate. --- Maciek Sakrejda | System Architect | Truviso 1065 E.
Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com Kind regards, Radek
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 1:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: OK, patch attached. Two comments: Revised patch attached. I tried configuring oprofile with --callgraph=10 and then running oprofile with -c, but it gives kooky looking output I can't interpret. For example:

      6  42.8571  postgres        record_in
      8  57.1429  postgres        pg_perm_setlocale
17035     5.7219  libc-2.11.2.so  memcpy
  17035  100.000  libc-2.11.2.so  memcpy [self]

Not that helpful. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company AtProcExit_Buffers-v2.patch Description: Binary data
Re: [HACKERS] [JDBC] JDBC and Binary protocol error, for some statements
Text format is marked as 0, binary format is 1. Sorry--I misread the docs. This is consistent and something does look fishy. Thanks for the tarball. I can take a look tonight. --- Maciek Sakrejda | System Architect | Truviso 1065 E. Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 01:20:36PM -0500, Robert Haas wrote: On Wed, Nov 24, 2010 at 1:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: OK, patch attached. Two comments: Revised patch attached. I tried configuring oprofile with --callgraph=10 and then running oprofile with -c, but it gives kooky looking output I can't interpret. For example:

      6  42.8571  postgres        record_in
      8  57.1429  postgres        pg_perm_setlocale
17035     5.7219  libc-2.11.2.so  memcpy
  17035  100.000  libc-2.11.2.so  memcpy [self]

Not that helpful. :-( Have a look at the wiki: http://wiki.postgresql.org/wiki/Profiling_with_OProfile#Additional_analysis Robert Haas Regards, Gerhard Heift signature.asc Description: Digital signature
Re: [HACKERS] profiling connection overhead
On Wednesday 24 November 2010 19:01:32 Robert Haas wrote: Somehow I don't think I'm going to get much further with this without figuring out how to get oprofile to cough up a call graph. I think to do that sensibly you need CFLAGS=-O2 -fno-omit-frame-pointer...
[HACKERS] function(constants) evaluated for every row
Someone offlist reported query slowness because we don't convert function calls with all-constant parameters to constants before we start a sequential scan:

EXPLAIN SELECT * FROM test
WHERE x >= to_date('2001-01-01', 'YYYY-MM-DD') AND
      x <= to_date('2001-01-01', 'YYYY-MM-DD');

                          QUERY PLAN
---------------------------------------------------------------
 Seq Scan on test  (cost=0.00..58.00 rows=12 width=4)
   Filter: ((x >= to_date('2001-01-01'::text, 'YYYY-MM-DD'::text)) AND (x <= to_date('2001-01-01'::text, 'YYYY-MM-DD'::text)))
(2 rows)

Notice the to_date()'s were not converted to constants in EXPLAIN, so they are evaluated for every row. to_date() is marked STABLE. Is this something we should improve? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] function(constants) evaluated for every row
Bruce Momjian br...@momjian.us writes: Notice the to_date()'s were not converted to constants in EXPLAIN so they are evaluated for every row. to_date() is marked STABLE. Is this something we should improve? No. This is per expectation. Only IMMUTABLE functions can be folded to constants in advance of the query. regards, tom lane
Re: [HACKERS] function(constants) evaluated for every row
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Notice the to_date()'s were not converted to constants in EXPLAIN so they are evaluated for every row. to_date() is marked STABLE. Is this something we should improve? No. This is per expectation. Only IMMUTABLE functions can be folded to constants in advance of the query. Well, CREATE FUNCTION says about STABLE: STABLE indicates that the function cannot modify the database, and that within a single table scan it will consistently return the same result for the same argument values, but that its result could change across SQL statements. This is the appropriate selection for functions whose results depend on database lookups, parameter variables (such as the current time zone), etc. (It is inappropriate for AFTER triggers that wish to query rows modified by the current command.) Also note that the current_timestamp family of functions qualify as stable, since their values do not change within a transaction. I realize they can't be converted to constants before the query starts, but is there a reason we can't convert those functions to constants in the executor before a table scan? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] profiling connection overhead
Gerhard Heift ml-postgresql-20081012-3...@gheift.de writes: On Wed, Nov 24, 2010 at 01:20:36PM -0500, Robert Haas wrote: I tried configuring oprofile with --callgraph=10 and then running oprofile with -c, but it gives kooky looking output I can't interpret. Have a look at the wiki: http://wiki.postgresql.org/wiki/Profiling_with_OProfile#Additional_analysis The critical piece of information is there now, but it wasn't a minute ago. regards, tom lane
Re: [HACKERS] Extensions, this time with a patch
Itagaki Takahiro itagaki.takah...@gmail.com writes: RECOVERY_COMMAND_FILE is opened twice in the patch. The first time is for checking the existence, and the second time is for parsing. Instead of the repeat, how about adding a FILE* version of the parser? It will also be called from ParseConfigFile() as a subroutine. bool ParseConfigFd(FILE *fd, const char *config_file, int depth, ...) Something like the attached, version 5 of the patch? I've been using the function name ParseConfigFp because the internal parameter was called fp in the previous function body. I suppose that could easily be changed at commit time if necessary. BTW, the parser supports include and custom_variable_classes not only for postgresql.conf but also for all files. Is it an intended behavior? I think they are harmless, so we don't have to change the code; include might be useful even in recovery.conf, and custom_variable_classes will be an unrecognized-recovery-parameter error after all. Extensions will need the support for custom_variable_classes as it is done now, and as you say, the recovery will just error out. You have to clean your recovery.conf file then try again (I just tried and confirmed). I personally don't see any harm in having the features in all currently known use-cases, and I don't see any point in walking an extra mile here to remove a feature in some cases. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 5024,5200 ****
  str_time(pg_time_t tnow)
  }
  
  /*
- * Parse one line from recovery.conf. 'cmdline' is the raw line from the
- * file. If the line is parsed successfully, returns true, false indicates
- * syntax error. On success, *key_p and *value_p are set to the parameter
- * name and value on the line, respectively.
If the line is an empty line,
- * consisting entirely of whitespace and comments, function returns true
- * and *keyp_p and *value_p are set to NULL.
- *
- * The pointers returned in *key_p and *value_p point to an internal buffer
- * that is valid only until the next call of parseRecoveryCommandFile().
- */
- static bool
- parseRecoveryCommandFileLine(char *cmdline, char **key_p, char **value_p)
- {
- 	char	   *ptr;
- 	char	   *bufp;
- 	char	   *key;
- 	char	   *value;
- 	static char *buf = NULL;
- 
- 	*key_p = *value_p = NULL;
- 
- 	/*
- 	 * Allocate the buffer on first use. It's used to hold both the parameter
- 	 * name and value.
- 	 */
- 	if (buf == NULL)
- 		buf = malloc(MAXPGPATH + 1);
- 	bufp = buf;
- 
- 	/* Skip any whitespace at the beginning of line */
- 	for (ptr = cmdline; *ptr; ptr++)
- 	{
- 		if (!isspace((unsigned char) *ptr))
- 			break;
- 	}
- 	/* Ignore empty lines */
- 	if (*ptr == '\0' || *ptr == '#')
- 		return true;
- 
- 	/* Read the parameter name */
- 	key = bufp;
- 	while (*ptr && !isspace((unsigned char) *ptr) &&
- 		   *ptr != '=' && *ptr != '\'')
- 		*(bufp++) = *(ptr++);
- 	*(bufp++) = '\0';
- 
- 	/* Skip to the beginning quote of the parameter value */
- 	ptr = strchr(ptr, '\'');
- 	if (!ptr)
- 		return false;
- 	ptr++;
- 
- 	/* Read the parameter value to *bufp. Collapse any '' escapes as we go. */
- 	value = bufp;
- 	for (;;)
- 	{
- 		if (*ptr == '\'')
- 		{
- 			ptr++;
- 			if (*ptr == '\'')
- 				*(bufp++) = '\'';
- 			else
- 			{
- 				/* end of parameter */
- 				*bufp = '\0';
- 				break;
- 			}
- 		}
- 		else if (*ptr == '\0')
- 			return false;		/* unterminated quoted string */
- 		else
- 			*(bufp++) = *ptr;
- 
- 		ptr++;
- 	}
- 	*(bufp++) = '\0';
- 
- 	/* Check that there's no garbage after the value */
- 	while (*ptr)
- 	{
- 		if (*ptr == '#')
- 			break;
- 		if (!isspace((unsigned char) *ptr))
- 			return false;
- 		ptr++;
- 	}
- 
- 	/* Success! */
- 	*key_p = key;
- 	*value_p = value;
- 	return true;
- }
- 
- /*
  * See if there is a recovery command file (recovery.conf), and if so
  * read in parameters for archive recovery and XLOG streaming.
  *
!
* XXX longer term intention is to expand this to
! * cater for additional parameters and controls
! * possibly use a flex lexer similar to the GUC one
  */
  static void
  readRecoveryCommandFile(void)
  {
  	FILE	   *fd;
- 	char		cmdline[MAXPGPATH];
  	TimeLineID	rtli = 0;
  	bool		rtliGiven = false;
! 	bool		syntaxError = false;
  
  	fd = AllocateFile(RECOVERY_COMMAND_FILE, "r");
  	if (fd == NULL)
  	{
  		if (errno == ENOENT)
! 			return;				/* not there, so no archive recovery */
  		ereport(FATAL,
  				(errcode_for_file_access(),
  				 errmsg("could not open recovery command file \"%s\": %m",
  						RECOVERY_COMMAND_FILE)));
  	}
  
! 	/*
! 	 * Parse the file...
! 	 */
! 	while (fgets(cmdline, sizeof(cmdline), fd) != NULL)
! 	{
! 		char	   *tok1;
! 		char	   *tok2;
! 
! 		if (!parseRecoveryCommandFileLine(cmdline, &tok1, &tok2))
! 		{
! 			syntaxError = true;
! 			break;
! 		}
! 		if (tok1 == NULL)
! 			continue;
! 
! 		if
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: Revised patch attached. The asserts in AtProcExit_LocalBuffers are a bit pointless since you forgot to remove the code that forcibly zeroes LocalRefCount[]... otherwise +1. regards, tom lane
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: Full results, and call graph, attached. The first obvious fact is that most of the memset overhead appears to be coming from InitCatCache. AFAICT that must be the palloc0 calls that are zeroing out (mostly) the hash bucket headers. I don't see any real way to make that cheaper other than to cut the initial sizes of the hash tables (and add support for expanding them later, which is lacking in catcache ATM). Not convinced that that creates any net savings --- it might just save some cycles at startup in exchange for more cycles later, in typical backend usage. Making those hashtables expansible wouldn't be a bad thing in itself, mind you. regards, tom lane
Re: [HACKERS] function(constants) evaluated for every row
Bruce Momjian br...@momjian.us writes: I realize they can't be converted to constants before the query starts but is there a reason we can't convert those functions to constants in the executor before a table scan? Other than the significant number of cycles that would be wasted (in most cases) checking for the possibility, probably not. I'm dubious that it would average out to a win though. regards, tom lane
Re: [HACKERS] Per-column collation
On mån, 2010-11-22 at 11:58 +0900, Itagaki Takahiro wrote: * Did you see any performance regression by collation? I found a bug in lc_collate_is_c(); result >= 0 should be checked before any other checks. SearchSysCache1() here would be a performance regression. That code turned out to be buggy anyway, because it was using the result cache variable independent of the collation parameter. I did some profiling with this now. The problem is that this function lc_collate_is_c() would need to cache the C-ness property for any number of collations. Depending on what call pattern you expect or want to optimize for, you might end up caching most of the pg_collation catalog, which is actually the mandate of SearchSysCache, but the profile shows that SearchSysCache takes a large chunk of the additional run time. If I remove that branch altogether, that is, don't treat the C locale specially at all in the nondefault collation case, then using non-C locales as nondefault collation is almost as fast as using non-C locales as default collation. However, using the C locale as a nondefault collation would then be quite slow (still faster than non-C locales). The solution would perhaps be a custom, lightweight caching system, but I haven't thought of one yet.
Re: [HACKERS] function(contants) evaluated for every row
On Wed, Nov 24, 2010 at 21:52, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: Notice the to_date()'s were not converted to constants in EXPLAIN so they are evaluated for every row. to_date() is marked STABLE. No. This is per expectation. Only IMMUTABLE functions can be folded to constants in advance of the query. This is something that has bitten me in the past. I realize that STABLE functions cannot be constant-folded at planning time. But are there good reasons why they cannot be called only once at execution time? As long as *only* STABLE or IMMUTABLE functions are used in a query, we can assume that settings like timezone won't change in the middle of the execution of the query, thus STABLE function calls can be collapsed -- right? Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation, work in progress
On tor, 2010-10-14 at 22:54 -0400, Robert Haas wrote: It seems you've falsified the header comment in pathkeys_useful_for_merging(), although I guess it's already false because it doesn't seem to have been updated for the NULLS ASC/DESC stuff, and the interior comment in right_merge_direction() also needs adjusting. But this might be more than a documentation problem, because the choice of merge direction really *is* arbitrary in the case of ASC/DESC and NULLS FIRST/LAST, but I'm not sure whether that's actually true for collation. If collation affects the definition of equality then it certainly isn't true. I did check that again and didn't arrive at the conclusion that the comments would need updating either with respect to this patch or some previous change. Could you check again and possibly provide a suggestion? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation, work in progress
On ons, 2010-09-22 at 19:44 +0900, Itagaki Takahiro wrote: * CREATE TABLE (LIKE table_with_collation) doesn't inherit collations. This was fixed in the CF2010-11 patch. * psql \d needs a separator between collate and not null modifiers. And this as well. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Full results, and call graph, attached. The first obvious fact is that most of the memset overhead appears to be coming from InitCatCache. AFAICT that must be the palloc0 calls that are zeroing out (mostly) the hash bucket headers. I don't see any real way to make that cheaper other than to cut the initial sizes of the hash tables (and add support for expanding them later, which is lacking in catcache ATM). Not convinced that that creates any net savings --- it might just save some cycles at startup in exchange for more cycles later, in typical backend usage. Making those hashtables expansible wouldn't be a bad thing in itself, mind you. The idea I had was to go the other way and say, hey, if these hash tables can't be expanded anyway, let's put them on the BSS instead of heap-allocating them. Any new pages we request from the OS will be zeroed anyway, but with palloc we then have to re-zero the allocated block because palloc can return memory that's been used, freed, and reused. However, for anything that only needs to be allocated once and never freed, and whose size can be known at compile time, that's not an issue. In fact, it wouldn't be that hard to relax the "known at compile time" constraint either. We could just declare: char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED]; ...and then peel off chunks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wednesday 24 November 2010 21:47:32 Robert Haas wrote: On Wed, Nov 24, 2010 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Full results, and call graph, attached. The first obvious fact is that most of the memset overhead appears to be coming from InitCatCache. AFAICT that must be the palloc0 calls that are zeroing out (mostly) the hash bucket headers. I don't see any real way to make that cheaper other than to cut the initial sizes of the hash tables (and add support for expanding them later, which is lacking in catcache ATM). Not convinced that that creates any net savings --- it might just save some cycles at startup in exchange for more cycles later, in typical backend usage. Making those hashtables expansible wouldn't be a bad thing in itself, mind you. The idea I had was to go the other way and say, hey, if these hash tables can't be expanded anyway, let's put them on the BSS instead of heap-allocating them. Any new pages we request from the OS will be zeroed anyway, but with palloc we then have to re-zero the allocated block anyway because palloc can return a memory that's been used, freed, and reused. However, for anything that only needs to be allocated once and never freed, and whose size can be known at compile time, that's not an issue. In fact, it wouldn't be that hard to relax the known at compile time constraint either. We could just declare: char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED]; ...and then peel off chunks. Won't this just cause loads of additional pagefaults after fork() when those pages are used the first time and then a second time when first written to (to copy it)? Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 3:53 PM, Andres Freund and...@anarazel.de wrote: On Wednesday 24 November 2010 21:47:32 Robert Haas wrote: On Wed, Nov 24, 2010 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Full results, and call graph, attached. The first obvious fact is that most of the memset overhead appears to be coming from InitCatCache. AFAICT that must be the palloc0 calls that are zeroing out (mostly) the hash bucket headers. I don't see any real way to make that cheaper other than to cut the initial sizes of the hash tables (and add support for expanding them later, which is lacking in catcache ATM). Not convinced that that creates any net savings --- it might just save some cycles at startup in exchange for more cycles later, in typical backend usage. Making those hashtables expansible wouldn't be a bad thing in itself, mind you. The idea I had was to go the other way and say, hey, if these hash tables can't be expanded anyway, let's put them on the BSS instead of heap-allocating them. Any new pages we request from the OS will be zeroed anyway, but with palloc we then have to re-zero the allocated block anyway because palloc can return a memory that's been used, freed, and reused. However, for anything that only needs to be allocated once and never freed, and whose size can be known at compile time, that's not an issue. In fact, it wouldn't be that hard to relax the known at compile time constraint either. We could just declare: char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED]; ...and then peel off chunks. Won't this just cause loads of additional pagefaults after fork() when those pages are used the first time and then a second time when first written to (to copy it)? Aren't we incurring those page faults anyway, for whatever memory palloc is handing out? The heap is no different from bss; we just move the pointer with sbrk(). 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Regex code versus Unicode chars beyond codepoint 255
Bug #5766 points out that we're still not there yet in terms of having sane behavior for locale-specific regex operations in Unicode encoding. The reason it's not working is that regc_locale does this to expand the set of characters that are considered to match [[:alnum:]]: /* * Now compute the character class contents. * * For the moment, assume that only char codes < 256 can be in these * classes. */ ... case CC_ALNUM: cv = getcvec(v, UCHAR_MAX, 0); if (cv) { for (i = 0; i <= UCHAR_MAX; i++) { if (pg_wc_isalnum((chr) i)) addchr(cv, (chr) i); } } break; This is a leftover from when we weren't trying to behave sanely for multibyte encodings. Now that we are, it's clearly not good enough. But iterating up to many thousands in this loop isn't too appetizing from a performance standpoint. I looked at the equivalent place in Tcl, and I see that what they're currently doing is they have a hard-wired list of all the Unicode code points that are classified as alnum, punct, etc. We could duplicate that (and use it only if encoding is UTF8), but it seems kind of ugly, and it doesn't respect the idea that the locale setting ought to control which characters are considered to be in each class. Another possibility is to take those lists but apply iswpunct() and friends to the values, including in the finished set only the code points that pass. So what you get is the intersection of the Unicode list and the locale behavior. Some of the performance pressure could be taken off if we cached the results instead of recomputing them every time a regex uses the character classification; but I'm not sure how much that would save. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 3:53 PM, Andres Freund and...@anarazel.de wrote: The idea I had was to go the other way and say, hey, if these hash tables can't be expanded anyway, let's put them on the BSS instead of heap-allocating them. Won't this just cause loads of additional pagefaults after fork() when those pages are used the first time and then a second time when first written to (to copy it)? Aren't we incurring those page faults anyway, for whatever memory palloc is handing out? The heap is no different from bss; we just move the pointer with sbrk(). I think you're missing the real point, which that the cost you're measuring here probably isn't so much memset() as faulting in large chunks of address space. Avoiding the explicit memset() likely will save little in real runtime --- it'll just make sure the initial-touch costs are more distributed and harder to measure. But in any case I think this idea is a nonstarter because it gets in the way of making those hashtables expansible, which we *do* need to do eventually. (You might be able to confirm or disprove this theory if you ask oprofile to count memory access stalls instead of CPU clock cycles...) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wednesday 24 November 2010 21:54:53 Robert Haas wrote: On Wed, Nov 24, 2010 at 3:53 PM, Andres Freund and...@anarazel.de wrote: On Wednesday 24 November 2010 21:47:32 Robert Haas wrote: On Wed, Nov 24, 2010 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Full results, and call graph, attached. The first obvious fact is that most of the memset overhead appears to be coming from InitCatCache. AFAICT that must be the palloc0 calls that are zeroing out (mostly) the hash bucket headers. I don't see any real way to make that cheaper other than to cut the initial sizes of the hash tables (and add support for expanding them later, which is lacking in catcache ATM). Not convinced that that creates any net savings --- it might just save some cycles at startup in exchange for more cycles later, in typical backend usage. Making those hashtables expansible wouldn't be a bad thing in itself, mind you. The idea I had was to go the other way and say, hey, if these hash tables can't be expanded anyway, let's put them on the BSS instead of heap-allocating them. Any new pages we request from the OS will be zeroed anyway, but with palloc we then have to re-zero the allocated block anyway because palloc can return a memory that's been used, freed, and reused. However, for anything that only needs to be allocated once and never freed, and whose size can be known at compile time, that's not an issue. In fact, it wouldn't be that hard to relax the known at compile time constraint either. We could just declare: char lotsa_zero_bytes[NUM_ZERO_BYTES_WE_NEED]; ...and then peel off chunks. Won't this just cause loads of additional pagefaults after fork() when those pages are used the first time and then a second time when first written to (to copy it)? Aren't we incurring those page faults anyway, for whatever memory palloc is handing out? The heap is no different from bss; we just move the pointer with sbrk(). Yes, but only once. 
Also scrubbing a page is faster than copying it... (and there were patches floating around to do that in advance, not sure if they got integrated into mainline linux) Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling pgbench
Robert Haas robertmh...@gmail.com writes: I did some profiling of pgbench -j 36 -c 36 -T 500 banging on my two-core desktop box - with synchronous_commit turned off to keep the fsyncs from dominating the profile - and got these results: 29634 4.7124 postgres base_yyparse Seems like pgbench is a poster child for the value of prepared statements. Have you tried it with -M prepared? I'd take this with a grain of salt as to whether it's representative of real applications, of course. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling pgbench
On Wednesday 24 November 2010 21:24:43 Robert Haas wrote: I'd like to get access to a box with (a lot) more cores, to see whether the lock stuff moves up in the profile. A big chunk of that hash_search_with_hash_value overhead is coming from LockAcquireExtended. The __strcmp_sse2 is almost entirely parsing overhead. In general, I'm not sure there's much hope for reducing the parsing overhead, although ScanKeywordLookup() can certainly be done better. XLogInsert() is spending a lot of time doing CRC's. LWLockAcquire() is dropping cycles in many different places. I can get you profiles of machines with up to 24 real cores, unfortunately I can't give access away. Regarding CRCs: I spent some time optimizing these, as you might remember. The wall I hit optimizing it benefit-wise is that the single CRC calls (4 for a non-indexed single-row insert on a table with 1 column inside a transaction) are just too damn small to get more efficient. It's causing pipeline stalls all over... (21, 5, 1, 28 bytes). I have a very preliminary patch calculating the CRC over the whole thing in one go if it can do so (no switch, no xl buffers wraparound), but it's highly ugly as it needs to read from the xl insert buffers and then reinsert the crc at the correct position. While it shows a noticeable improvement, that doesn't seem to be a good way to go. It could be made to work properly though. I played around with some ideas to do that more nicely, but none were gratifying. Regarding LWLockAcquire costs: Yes, it's pretty noticeable - on loads of different usages. On a bunch of production machines it's the second most expensive function (behind XLogInsert), on some the most expensive. Most of the time All of those machines are Nehalems though, so the image may be a bit distorted. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Nov 24, 2010, at 4:05 PM, Andres Freund and...@anarazel.de wrote: Won't this just cause loads of additional pagefaults after fork() when those pages are used the first time and then a second time when first written to (to copy it)? Aren't we incurring those page faults anyway, for whatever memory palloc is handing out? The heap is no different from bss; we just move the pointer with sbrk(). Yes, but only once. Also scrubbing a page is faster than copying it... (and there were patches floating around to do that in advance, not sure if they got integrated into mainline linux) I'm not following - can you elaborate? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling pgbench
On Wednesday 24 November 2010 22:14:04 Andres Freund wrote: On Wednesday 24 November 2010 21:24:43 Robert Haas wrote: I'd like to get access to a box with (a lot) more cores, to see whether the lock stuff moves up in the profile. A big chunk of that hash_search_with_hash_value overhead is coming from LockAcquireExtended. The __strcmp_sse2 is almost entirely parsing overhead. In general, I'm not sure there's much hope for reducing the parsing overhead, although ScanKeywordLookup() can certainly be done better. XLogInsert() is spending a lot of time doing CRC's. LWLockAcquire() is dropping cycles in many different places. I can get you profiles of machines with up to 24 real cores, unfortunately I can't give access away. Regarding CRCs: I spent some time optimizing these, as you might remember. The wall I hit optimizing it benefit-wise is that the single CRC calls (4 for a non-indexed single-row insert on a table with 1 column inside a transaction) are just too damn small to get more efficient. It's causing pipeline stalls all over... (21, 5, 1, 28 bytes). I have a very preliminary patch calculating the CRC over the whole thing in one go if it can do so (no switch, no xl buffers wraparound), but it's highly ugly as it needs to read from the xl insert buffers and then reinsert the crc at the correct position. While it shows a noticeable improvement, that doesn't seem to be a good way to go. It could be made to work properly though. I played around with some ideas to do that more nicely, but none were gratifying. Regarding LWLockAcquire costs: Yes, it's pretty noticeable - on loads of different usages. On a bunch of production machines it's the second most expensive function (behind XLogInsert), on some the most expensive. Most of the time AllocSetAlloc is the third, battling with hash_search_with_hash_value. To complete that sentence...
Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: On Nov 24, 2010, at 4:05 PM, Andres Freund and...@anarazel.de wrote: Yes, but only once. Also scrubbing a page is faster than copying it... (and there were patches floating around to do that in advance, not sure if they got integrated into mainline linux) I'm not following - can you elaborate? I think Andres is saying that bss space isn't optimized during a fork operation: it'll be propagated to the child as copy-on-write pages. Dunno if that's true or not, but if it is, it'd be a good reason to avoid the scheme you're suggesting. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wednesday 24 November 2010 22:18:08 Robert Haas wrote: On Nov 24, 2010, at 4:05 PM, Andres Freund and...@anarazel.de wrote: Won't this just cause loads of additional pagefaults after fork() when those pages are used the first time and then a second time when first written to (to copy it)? Aren't we incurring those page faults anyway, for whatever memory palloc is handing out? The heap is no different from bss; we just move the pointer with sbrk(). Yes, but only once. Also scrubbing a page is faster than copying it... (and there were patches floating around to do that in advance, not sure if they got integrated into mainline linux) I'm not following - can you elaborate? When forking, the memory mapping of the process is copied - the actual pages are not. When a page is first accessed, the page fault handler will set up a mapping to the old page and mark it as shared. When it is then written to, it will fault again and copy the page. In contrast, if you access a page for the first time after an sbrk (or mmap, doesn't matter), a new page will get scrubbed and a mapping will get set up. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wednesday 24 November 2010 22:25:45 Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Nov 24, 2010, at 4:05 PM, Andres Freund and...@anarazel.de wrote: Yes, but only once. Also scrubbing a page is faster than copying it... (and there were patches floating around to do that in advance, not sure if they got integrated into mainline linux) I'm not following - can you elaborate? I think Andres is saying that bss space isn't optimized during a fork operation: it'll be propagated to the child as copy-on-write pages. Dunno if that's true or not, but if it is, it'd be a good reason to avoid the scheme you're suggesting. Afair nearly all pages are propagated with copy-on-write semantics. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 4:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: (You might be able to confirm or disprove this theory if you ask oprofile to count memory access stalls instead of CPU clock cycles...) I don't see an event for that. # opcontrol --list-events | grep STALL INSTRUCTION_FETCH_STALL: (counter: all) DISPATCH_STALLS: (counter: all) DISPATCH_STALL_FOR_BRANCH_ABORT: (counter: all) DISPATCH_STALL_FOR_SERIALIZATION: (counter: all) DISPATCH_STALL_FOR_SEGMENT_LOAD: (counter: all) DISPATCH_STALL_FOR_REORDER_BUFFER_FULL: (counter: all) DISPATCH_STALL_FOR_RESERVATION_STATION_FULL: (counter: all) DISPATCH_STALL_FOR_FPU_FULL: (counter: all) DISPATCH_STALL_FOR_LS_FULL: (counter: all) DISPATCH_STALL_WAITING_FOR_ALL_QUIET: (counter: all) DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RESYNC: (counter: all) # opcontrol --list-events | grep MEMORY MEMORY_REQUESTS: (counter: all) MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS: (counter: all) MEMORY_CONTROLLER_SLOT_MISSED: (counter: all) MEMORY_CONTROLLER_TURNAROUNDS: (counter: all) MEMORY_CONTROLLER_BYPASS_COUNTER_SATURATION: (counter: all) CPU_IO_REQUESTS_TO_MEMORY_IO: (counter: all) MEMORY_CONTROLLER_REQUESTS: (counter: all) Ideas? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 4:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: (You might be able to confirm or disprove this theory if you ask oprofile to count memory access stalls instead of CPU clock cycles...) I don't see an event for that. You probably want something involving cache misses. The event names vary depending on just which CPU you've got. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] duplicate connection failure messages
Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I assume you are suggesting to use our inet_net_ntop() even if the system has inet_ntop(). If you're going to have code to do the former, it doesn't seem to be worth the trouble to also have code that does the latter ... OK, we will not call inet_ntop() at all. I moved the CIDR part of adt/inet_net_ntop.c into adt/inet_cidr_ntop.c, and moved the remaining net part to /port/inet_net_ntop.c. I then changed all uses of inet_ntoa to use inet_net_ntop(). While this churn would perhaps not be warranted just to allow for better error messages, I found pg_getaddrinfo_all() being called from libpq::connectDBStart(), which makes it not thread-safe. I am not excited about backpatching it but it is a threading bug. The output is as expected: $ psql -h localhost test psql: could not connect to server: Connection refused Is the server running on host localhost (127.0.0.1) and accepting TCP/IP connections on port 5432? $ psql -h 127.0.0.1 test psql: could not connect to server: Connection refused Is the server running on host 127.0.0.1 and accepting TCP/IP connections on port 5432? Applied. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wednesday 24 November 2010 23:03:48 Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 4:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: (You might be able to confirm or disprove this theory if you ask oprofile to count memory access stalls instead of CPU clock cycles...) I don't see an event for that. You probably want something involving cache misses. The event names vary depending on just which CPU you've got. Or some BUS OUTSTANDING event. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 5:15 PM, Andres Freund and...@anarazel.de wrote: On Wednesday 24 November 2010 23:03:48 Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Nov 24, 2010 at 4:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: (You might be able to confirm or disprove this theory if you ask oprofile to count memory access stalls instead of CPU clock cycles...) I don't see an event for that. You probably want something involving cache misses. The event names vary depending on just which CPU you've got. Or some BUS OUTSTANDING event. I don't see anything for BUS OUTSTANDING. For CACHE and MISS I have some options: # opcontrol --list-events | grep CACHE DATA_CACHE_ACCESSES: (counter: all) DATA_CACHE_MISSES: (counter: all) DATA_CACHE_REFILLS_FROM_L2_OR_NORTHBRIDGE: (counter: all) DATA_CACHE_REFILLS_FROM_NORTHBRIDGE: (counter: all) DATA_CACHE_LINES_EVICTED: (counter: all) LOCKED_INSTRUCTIONS_DCACHE_MISSES: (counter: all) L2_CACHE_MISS: (counter: all) L2_CACHE_FILL_WRITEBACK: (counter: all) INSTRUCTION_CACHE_FETCHES: (counter: all) INSTRUCTION_CACHE_MISSES: (counter: all) INSTRUCTION_CACHE_REFILLS_FROM_L2: (counter: all) INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM: (counter: all) INSTRUCTION_CACHE_VICTIMS: (counter: all) INSTRUCTION_CACHE_INVALIDATED: (counter: all) CACHE_BLOCK_COMMANDS: (counter: all) READ_REQUEST_L3_CACHE: (counter: all) L3_CACHE_MISSES: (counter: all) IBS_FETCH_ICACHE_MISSES: (ext: ibs_fetch) IBS_FETCH_ICACHE_HITS: (ext: ibs_fetch) IBS_OP_DATA_CACHE_MISS: (ext: ibs_op) IBS_OP_NB_LOCAL_CACHE: (ext: ibs_op) IBS_OP_NB_REMOTE_CACHE: (ext: ibs_op) IBS_OP_NB_CACHE_MODIFIED: (ext: ibs_op) IBS_OP_NB_CACHE_OWNED: (ext: ibs_op) IBS_OP_NB_LOCAL_CACHE_LAT: (ext: ibs_op) IBS_OP_NB_REMOTE_CACHE_LAT: (ext: ibs_op) # opcontrol --list-events | grep MISS | grep -v CACHE L1_DTLB_MISS_AND_L2_DTLB_HIT: (counter: all) L1_DTLB_AND_L2_DTLB_MISS: (counter: all) L1_ITLB_MISS_AND_L2_ITLB_HIT: (counter: all) L1_ITLB_MISS_AND_L2_ITLB_MISS: (counter: all) 
MEMORY_CONTROLLER_SLOT_MISSED: (counter: all) IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_HITS: (ext: ibs_fetch) IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_MISSES: (ext: ibs_fetch) IBS_OP_L1_DTLB_MISS_L2_DTLB_HIT: (ext: ibs_op) IBS_OP_L1_L2_DTLB_MISS: (ext: ibs_op) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling pgbench
On Wed, Nov 24, 2010 at 1:19 PM, Andres Freund and...@anarazel.de wrote: On Wednesday 24 November 2010 22:14:04 Andres Freund wrote: On Wednesday 24 November 2010 21:24:43 Robert Haas wrote: Regarding LWLockAcquire costs: Yes, it's pretty noticeable - on loads of different usages. On a bunch of production machines it's the second most expensive function (behind XLogInsert), on some the most expensive. Most of the time AllocSetAlloc is the third, battling with hash_search_with_hash_value. To complete that sentence... I've played a bit with hash_search_with_hash_value and found that most of the time is spent on shared hash tables, not private ones. And the time attributed to it for the shared hash tables mostly seems to be due to the time it takes to fight cache lines away from other CPUs. I suspect the same thing is true of LWLockAcquire. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling pgbench
On Wed, Nov 24, 2010 at 5:33 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Wed, Nov 24, 2010 at 1:19 PM, Andres Freund and...@anarazel.de wrote: On Wednesday 24 November 2010 22:14:04 Andres Freund wrote: On Wednesday 24 November 2010 21:24:43 Robert Haas wrote: Recarding LWLockAcquire costs: Yes, its pretty noticeable - on loads of different usages. On a bunch of production machines its the second (begind XLogInsert) on some the most expensive function. Most of the time AllocSetAlloc is the third, battling with hash_search_with_hash value. To complete that sentence... I've played a bit with hash_search_with_hash_value and found that most of the time is spent on shared hash tables, not private ones. And the time attributed to it for the shared hash tables mostly seems to be due to the time it takes to fight cache lines away from other CPUs. I suspect the same thing is true of LWLockAcquire. How do you get stats on that? How big is a typical cache line on modern CPUs? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling pgbench
On Wednesday 24 November 2010 23:34:46 Robert Haas wrote: On Wed, Nov 24, 2010 at 5:33 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Wed, Nov 24, 2010 at 1:19 PM, Andres Freund and...@anarazel.de wrote: On Wednesday 24 November 2010 22:14:04 Andres Freund wrote: On Wednesday 24 November 2010 21:24:43 Robert Haas wrote: Recarding LWLockAcquire costs: Yes, its pretty noticeable - on loads of different usages. On a bunch of production machines its the second (begind XLogInsert) on some the most expensive function. Most of the time AllocSetAlloc is the third, battling with hash_search_with_hash value. To complete that sentence... I've played a bit with hash_search_with_hash_value and found that most of the time is spent on shared hash tables, not private ones. And the time attributed to it for the shared hash tables mostly seems to be due to the time it takes to fight cache lines away from other CPUs. I suspect the same thing is true of LWLockAcquire. How do you get stats on that? Btw, if you have some recent kernel I would try to use perf - I find the event mappings easier to understand there, but maybe thats just me jumping onto bandwagons. How big is a typical cache line on modern CPUs? for the line size cat /sys/devices/system/cpu/cpu0/cache/index*/coherency_line_size for the overall size cat /sys/devices/system/cpu/cpu0/cache/index*/size Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas robertmh...@gmail.com writes: I don't see anything for BUS OUTSTANDING. For CACHE and MISS I have some options: DATA_CACHE_MISSES: (counter: all) L3_CACHE_MISSES: (counter: all) Those two look promising, though I can't claim to be an expert. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Wed, Nov 24, 2010 at 5:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I don't see anything for BUS OUTSTANDING. For CACHE and MISS I have some options: DATA_CACHE_MISSES: (counter: all) L3_CACHE_MISSES: (counter: all) Those two look promising, though I can't claim to be an expert. OK. Thanksgiving is about to interfere with my access to this machine, but I'll pick this up next week. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Remove useless whitespace at end of lines
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Peter Eisentraut's message of mar nov 23 17:52:18 -0300 2010: Remove useless whitespace at end of lines This was stuck in the moderation queue because of message size limit (30 kB). Is it worth increasing that value? Evidently we should. pgindent and copyright-update commits are likely to be at least this long. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Remove useless whitespace at end of lines
On Wed, Nov 24, 2010 at 23:45, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Peter Eisentraut's message of mar nov 23 17:52:18 -0300 2010: Remove useless whitespace at end of lines This was stuck in the moderation queue because of message size limit (30 kB). Is it worth increasing that value? Evidently we should. pgindent and copyright-update commits are likely to be at least this long. That's twice a year only - I don't see a big problem moderating those when it happens... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Remove useless whitespace at end of lines
Magnus Hagander wrote: On Wed, Nov 24, 2010 at 23:45, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Peter Eisentraut's message of mar nov 23 17:52:18 -0300 2010: Remove useless whitespace at end of lines This was stuck in the moderation queue because of message size limit (30 kB). Is it worth increasing that value? Evidently we should. pgindent and copyright-update commits are likely to be at least this long. That's twice a year only - I don't see a big problem moderating those when it happens... I do love to see my big pgindent commits come through. You can hear the wind rustling through the code. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling pgbench
Jeff Janes jeff.ja...@gmail.com writes: I've played a bit with hash_search_with_hash_value and found that most of the time is spent on shared hash tables, not private ones. And the time attributed to it for the shared hash tables mostly seems to be due to the time it takes to fight cache lines away from other CPUs. I suspect the same thing is true of LWLockAcquire. That squares with some behavior I've seen. If you run opannotate you often see ridiculously high time percentages attributed to extremely trivial C statements. The explanation seems to be that those places are where chunks of memory are first touched, and have to be pulled into the CPU's cache (and, if in shared memory, pulled away from some other CPU). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Remove useless whitespace at end of lines
Magnus Hagander mag...@hagander.net writes: On Wed, Nov 24, 2010 at 23:45, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@commandprompt.com writes: This was stuck in the moderation queue because of message size limit (30 kB). Is it worth increasing that value? Evidently we should. pgindent and copyright-update commits are likely to be at least this long. That's twice a year only - I don't see a big problem moderating those when it happens... It's not so much the moderation load as that I don't like being blindsided by commits that touch everything in sight. Finding out only when you try to do git push (as indeed happened to me just this afternoon because of this patch) is annoying. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling pgbench
On Wed, Nov 24, 2010 at 2:34 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Nov 24, 2010 at 5:33 PM, Jeff Janes jeff.ja...@gmail.com wrote: I've played a bit with hash_search_with_hash_value and found that most of the time is spent on shared hash tables, not private ones. And the time attributed to it for the shared hash tables mostly seems to be due to the time it takes to fight cache lines away from other CPUs. I suspect the same thing is true of LWLockAcquire. How do you get stats on that? I replicated the hash_search_with_hash_value function definition many times under different names, and changed different parts of the code to invoke the different names. (I don't trust profilers to correctly distribute times over call graphs. I think most of them just assume all function calls take the same time, and don't separately measure the time of calls coming from different parts of the code.) Then I took one of the clones that is heavily used and leaves the hash table unchanged, and had its caller invoke the same call several times in a row. The profiling confirmed that it was called 3 times more often, but the time spent in it increased by far less than 3 times; I think the increase in time in that function was only 10% or so (and in non-profiling code, the total run time did not increase by a noticeable amount). The only way I could explain this, other than redundant calls being optimized away (which the profiler's call counts dispute), is caching effects.
--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -95,7 +95,21 @@ BufTableLookup(BufferTag *tagPtr, uint32 hashcode)
 	BufferLookupEnt *result;
 
 	result = (BufferLookupEnt *)
-		hash_search_with_hash_value(SharedBufHash,
+		hash_search_with_hash_value5(SharedBufHash,
+									(void *) tagPtr,
+									hashcode,
+									HASH_FIND,
+									NULL);
+
+	result = (BufferLookupEnt *)
+		hash_search_with_hash_value5(SharedBufHash,
+									(void *) tagPtr,
+									hashcode,
+									HASH_FIND,
+									NULL);
+
+	result = (BufferLookupEnt *)
+		hash_search_with_hash_value5(SharedBufHash,
 									(void *) tagPtr,
 									hashcode,
 									HASH_FIND,

How big is a typical cache line on modern CPUs? That I don't know. I'm more of an experimentalist. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] function(contants) evaluated for every row
On Nov 24, 2010, at 15:28 , Marti Raudsepp wrote: On Wed, Nov 24, 2010 at 21:52, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: Notice the to_date()'s were not converted to constants in EXPLAIN so they are evaluated for every row. to_date() is marked STABLE. No. This is per expectation. Only IMMUTABLE functions can be folded to constants in advance of the query. This is something that has bitten me in the past. I realize that STABLE functions cannot be constant-folded at planning time. But are there good reasons why such a function cannot be called only once at execution time? As long as *only* STABLE or IMMUTABLE functions are used in a query, we can assume that settings like timezone won't change in the middle of the execution of a function, thus STABLE function calls can be collapsed -- right? I've seen this be a performance issue as well, in particular with partitioned tables. Out of habit I now write functions that always cache the value of the function in a variable and use the variable in the actual query to avoid this particular gotcha. Michael Glaesemann grzm seespotcode net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?
Robert Haas wrote: With respect to the syntax itself, I have mixed feelings. On the one hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS precisely because I believe they handle many common cases that people want in real life without much hullabaloo. But, there's clearly some limit to what can reasonably be done this way. At some point, what you really want is some kind of meta-language where you can write things like: IF EXISTS TABLE t1 THEN ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr; END IF; FYI, I have felt this way for a while. IF EXISTS seemed like something that should never have been added as an inline SQL command option; it just crept in, and kept growing. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] duplicate connection failure messages
Excerpts from Bruce Momjian's message of mié nov 24 19:04:30 -0300 2010: Bruce Momjian wrote: OK, we will not call inet_ntop() at all. I moved the CIDR part of adt/inet_net_ntop.c into adt/inet_cidr_ntop.c, and moved the remaining net part to /port/inet_net_ntop.c. Applied. This broke dugong in the ecpg tests. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?
On Wed, Nov 24, 2010 at 4:30 PM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: With respect to the syntax itself, I have mixed feelings. On the one hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS precisely because I believe they handle many common cases that people want in real life without much hullabaloo. But, there's clearly some limit to what can reasonably be done this way. At some point, what you really want is some kind of meta-language where you can write things like: IF EXISTS TABLE t1 THEN ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr; END IF; FYI, I have felt this way for a while. IF EXISTS seemed like something that should never have been added as an inline SQL command option; it just crept in, and kept growing. Okay, that being the case: would it make sense to have pg_dump emit DO blocks? I have a feeling this might draw fire, but I don't see any reason why the mechanism would not work to more or less equivalent effect. Certainly making dumps harder to use for those who insist on disabling PL/PGSQL is probably a negative side effect, if one can identify this hypothetical class of person. fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Document that a CHECKPOINT before taking a file system snapshot
2010/11/25 Bruce Momjian br...@momjian.us: Document that a CHECKPOINT before taking a file system snapshot can reduce recovery time. I didn't follow that on -hackers, but: * a checkpoint already takes place in the pg_start_backup process (before it returns, so before you can start snapshotting) * we used to issue a checkpoint *before* pg_start_backup to reduce the pg_start_backup duration; in case you have wal_archiving turned off and full_page_writes off, it reduces the IO contention a bit. (and even if not, I hate seeing pg_start_backup take seconds/minutes to finish) How will the checkpoint beforehand reduce recovery time? Branch -- master Details --- http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=7276ab5888d85782d988fc297ad2e176c7ad1bca Modified Files -- doc/src/sgml/backup.sgml | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committ...@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling pgbench
On Nov 24, 2010, at 5:49 PM, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Janes jeff.ja...@gmail.com writes: I've played a bit with hash_search_with_hash_value and found that most of the time is spent on shared hash tables, not private ones. And the time attributed to it for the shared hash tables mostly seems to be due to the time it takes to fight cache lines away from other CPUs. I suspect the same thing is true of LWLockAcquire. That squares with some behavior I've seen. If you run opannotate you often see ridiculously high time percentages attributed to extremely trivial C statements. The explanation seems to be that those places are where chunks of memory are first touched, and have to be pulled into the CPU's cache (and, if in shared memory, pulled away from some other CPU). Does it hurt that, for example, the BufMappingLocks are consecutive in memory? They appear to be among the more heavily contended locks even on my 2-core box. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] duplicate connection failure messages
Alvaro Herrera wrote: Excerpts from Bruce Momjian's message of nov 24 19:04:30 -0300 2010: Bruce Momjian wrote: OK, we will not call inet_ntop() at all. I moved the CIDR part of adt/inet_net_ntop.c into adt/inet_cidr_ntop.c, and moved the remaining net part to /port/inet_net_ntop.c. Applied. This broke dugong in the ecpg tests. I stopped checking the build page after a few hours, but I see the failure now. I have reviewed the libpq Makefile and I believe I have now properly added port/inet_net_ntop.c. I improved the comments as well. Patch attached and applied. Thanks for the heads-up. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index b327ee5..74ae79a 100644
*** /tmp/mhssze_Makefile	Wed Nov 24 21:42:22 2010
--- src/interfaces/libpq/Makefile	Wed Nov 24 21:04:00 2010
*************** endif
*** 27,39 ****
  # Need to recompile any libpgport object files because we need these
  # object files to use the same compile flags as libpq.  If we used
  # the object files from libpgport, this would not be true on all
! # platforms.
  LIBS := $(LIBS:-lpgport=)
  
  OBJS=	fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o \
  	fe-protocol2.o fe-protocol3.o pqexpbuffer.o pqsignal.o fe-secure.o \
  	libpq-events.o \
! 	md5.o ip.o wchar.o encnames.o noblock.o pgstrcasecmp.o thread.o \
  	$(filter crypt.o getaddrinfo.o inet_aton.o open.o snprintf.o strerror.o strlcpy.o win32error.o, $(LIBOBJS))
  
  ifeq ($(PORTNAME), cygwin)
--- 27,40 ----
  # Need to recompile any libpgport object files because we need these
  # object files to use the same compile flags as libpq.  If we used
  # the object files from libpgport, this would not be true on all
! # platforms.  We filter some object files so we only use object
! # files configure says we need.
  LIBS := $(LIBS:-lpgport=)
  
  OBJS=	fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o \
  	fe-protocol2.o fe-protocol3.o pqexpbuffer.o pqsignal.o fe-secure.o \
  	libpq-events.o \
! 	md5.o ip.o wchar.o encnames.o inet_net_ntop.o noblock.o pgstrcasecmp.o thread.o \
  	$(filter crypt.o getaddrinfo.o inet_aton.o open.o snprintf.o strerror.o strlcpy.o win32error.o, $(LIBOBJS))
  
  ifeq ($(PORTNAME), cygwin)
*************** backend_src = $(top_srcdir)/src/backend
*** 80,86 ****
  # For port modules, this only happens if configure decides the module
  # is needed (see filter hack in OBJS, above).
! crypt.c getaddrinfo.c inet_aton.c noblock.c open.c pgstrcasecmp.c snprintf.c strerror.c strlcpy.c thread.c win32error.c pgsleep.c: % : $(top_srcdir)/src/port/%
  	rm -f $@ && $(LN_S) $< .
  
  md5.c ip.c: % : $(backend_src)/libpq/%
--- 81,87 ----
  # For port modules, this only happens if configure decides the module
  # is needed (see filter hack in OBJS, above).
! crypt.c getaddrinfo.c inet_aton.c inet_net_ntop.c noblock.c open.c pgstrcasecmp.c snprintf.c strerror.c strlcpy.c thread.c win32error.c pgsleep.c: % : $(top_srcdir)/src/port/%
  	rm -f $@ && $(LN_S) $< .
  
  md5.c ip.c: % : $(backend_src)/libpq/%

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Document that a CHECKPOINT before taking a file system snapshot
Cédric Villemain wrote: 2010/11/25 Bruce Momjian br...@momjian.us: Document that a CHECKPOINT before taking a file system snapshot can reduce recovery time. I didn't follow that on -hackers, but: * a checkpoint already takes place in the pg_start_backup process (before it returns, so before you can start snapshotting) * we used to issue a checkpoint *before* pg_start_backup to reduce the pg_start_backup duration; in case you have wal_archiving turned off and full_page_writes off, it reduces the IO contention a bit. (and even if not, I hate seeing pg_start_backup take seconds/minutes to finish) How will the checkpoint beforehand reduce recovery time? This is for using file system snapshots, not PITR or continuous archiving. The checkpoint reduces how much WAL has to be replayed. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?
Daniel Farina wrote: On Wed, Nov 24, 2010 at 4:30 PM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: With respect to the syntax itself, I have mixed feelings. On the one hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS precisely because I believe they handle many common cases that people want in real life without much hullabaloo. But, there's clearly some limit to what can reasonably be done this way. At some point, what you really want is some kind of meta-language where you can write things like: IF EXISTS TABLE t1 THEN ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr; END IF; FYI, I have felt this way for a while. IF EXISTS seemed like something that should never have been added as an inline SQL command option; it just crept in, and kept growing. Okay, that being the case: would it make sense to have pg_dump emit DO blocks? I have a feeling this might draw fire, but I don't see any reason why the mechanism would not work to more or less equivalent effect. Certainly making dumps harder to use for those who insist on disabling PL/PGSQL is probably a negative side effect, if one can identify this hypothetical class of person. Not being able to recover a dump is a serious problem for a user. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?
On Wed, Nov 24, 2010 at 7:03 PM, Bruce Momjian br...@momjian.us wrote: Daniel Farina wrote: On Wed, Nov 24, 2010 at 4:30 PM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: With respect to the syntax itself, I have mixed feelings. On the one hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS precisely because I believe they handle many common cases that people want in real life without much hullabaloo. But, there's clearly some limit to what can reasonably be done this way. At some point, what you really want is some kind of meta-language where you can write things like: IF EXISTS TABLE t1 THEN ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr; END IF; FYI, I have felt this way for a while. IF EXISTS seemed like something that should never have been added as an inline SQL command option; it just crept in, and kept growing. Okay, that being the case: would it make sense to have pg_dump emit DO blocks? I have a feeling this might draw fire, but I don't see any reason why the mechanism would not work to more or less equivalent effect. Certainly making dumps harder to use for those who insist on disabling PL/PGSQL is probably a negative side effect, if one can identify this hypothetical class of person. Not being able to recover a dump is a serious problem for a user. Even if it only involves enabling PLPGSQL to do the restore? Also take into consideration that plpgsql is enabled by default. A user would have to change the template database (which, in general, can cause restores to fail in at least a few other ways) or drop the procedural language explicitly to make that mechanism not work with a fresh and normal-looking createdb. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?
Daniel Farina wrote: On Wed, Nov 24, 2010 at 7:03 PM, Bruce Momjian br...@momjian.us wrote: Daniel Farina wrote: On Wed, Nov 24, 2010 at 4:30 PM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: With respect to the syntax itself, I have mixed feelings. On the one hand, I'm a big fan of CREATE IF NOT EXISTS and DROP IF EXISTS precisely because I believe they handle many common cases that people want in real life without much hullabaloo. But, there's clearly some limit to what can reasonably be done this way. At some point, what you really want is some kind of meta-language where you can write things like: IF EXISTS TABLE t1 THEN ALTER TABLE t1 DROP CONSTRAINT IF EXISTS t1_constr; END IF; FYI, I have felt this way for a while. IF EXISTS seemed like something that should never have been added as an inline SQL command option; it just crept in, and kept growing. Okay, that being the case: would it make sense to have pg_dump emit DO blocks? I have a feeling this might draw fire, but I don't see any reason why the mechanism would not work to more or less equivalent effect. Certainly making dumps harder to use for those who insist on disabling PL/PGSQL is probably a negative side effect, if one can identify this hypothetical class of person. Not being able to recover a dump is a serious problem for a user. Even if it only involves enabling PLPGSQL to do the restore? Also take into consideration that plpgsql is enabled by default. A user would have to change the template database (which, in general, can cause restores to fail in at least a few other ways) or drop the procedural language explicitly to make that mechanism not work with a fresh and normal-looking createdb. What are we adding a pl/pgsql dependency for? What is the benefit that will warrant requiring people who disable plpgsql to enable it for restores? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true.
+ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] duplicate connection failure messages
bruce wrote: Alvaro Herrera wrote: Excerpts from Bruce Momjian's message of nov 24 19:04:30 -0300 2010: Bruce Momjian wrote: OK, we will not call inet_ntop() at all. I moved the CIDR part of adt/inet_net_ntop.c into adt/inet_cidr_ntop.c, and moved the remaining net part to /port/inet_net_ntop.c. Applied. This broke dugong in the ecpg tests. I stopped checking the build page after a few hours, but I see the failure now. I have reviewed the libpq Makefile and I believe I have now properly added port/inet_net_ntop.c. I improved the comments as well. Patch attached and applied. Thanks for the heads-up. OK, dugong is green now, so I think I got it. :-) -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE ... IF EXISTS feature?
On Wed, Nov 24, 2010 at 7:21 PM, Bruce Momjian br...@momjian.us wrote: What are we adding a pl/pgsql dependency for? What is the benefit that will warrant requiring people who disable plpgsql to enable it for restores? There are two use cases I want to cover:

1) It should be possible to restore a dump made with --clean on an empty database without error, so it can be run in a transaction and the error code can be usefully monitored.

2) It should be possible for a database to be dumped and restored by a non-superuser, again, cleanly, as per 1.

It was easy enough to change all the DROP ... statements to DROP ... IF EXISTS, but the ALTER statements have no equivalent, and thus the only way for a dump created with --clean to run without error is to ensure that all table and domain constraints exist prior to restore. The obvious mechanisms that have come to mind in this thread are:

* An IF EXISTS variant of ALTER, at least for TABLE and DOMAIN (although it may be strange to only support it on a couple of types)

* Use of anonymous DO code blocks (the prototype uses this, and this depends on plpgsql)

* Bizarre things I can imagine doing that involve creative queries that, as a side effect, might drop objects; I have not mentioned these because I thought they were too gross to be given serious consideration, but they might be plpgsql-less, which would be nice.

Note that in the case where one wants to dump/restore as a non-superuser, one may not be in a position to conveniently do a (DROP|CREATE) DATABASE statement to work around the problem. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation, work in progress
On Wed, Nov 24, 2010 at 3:37 PM, Peter Eisentraut pete...@gmx.net wrote: On tor, 2010-10-14 at 22:54 -0400, Robert Haas wrote: It seems you've falsified the header comment in pathkeys_useful_for_merging(), although I guess it's already false because it doesn't seem to have been updated for the NULLS ASC/DESC stuff, and the interior comment in right_merge_direction() also needs adjusting. But this might be more than a documentation problem, because the choice of merge direction really *is* arbitrary in the case of ASC/DESC and NULLS FIRST/LAST, but I'm not sure whether that's actually true for collation. If collation affects the definition of equality then it certainly isn't true. I did check that again and didn't arrive at the conclusion that the comments would need updating either with respect to this patch or some previous change. Could you check again and possibly provide a suggestion? I think that you are right and that my previous comment was erroneous. Sorry for the noise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tab completion for view triggers in psql
On Tue, Nov 23, 2010 at 10:21 PM, David Fetter da...@fetter.org wrote: Please find attached a patch changing both this and updateable to updatable, also per the very handy git grep I just learned about :) I looked a little more at this patch today. I didn't find any serious problems, though it would have helped me (and maybe other reviewers) to have a comprehensive list of the SQL statements which the patch implements autocompletion for. Not sure how hard this would be to add in, but the following gets autocompleted with an INSERT:

CREATE TRIGGER mytrg BEFORE IN[TAB]

while the following doesn't find any autocompletions:

CREATE TRIGGER mytrg BEFORE INSERT OR UP[TAB]

Also, this existed before the patch, but this SQL:

CREATE TRIGGER mytrg INSTEAD OF INSERT ON [TAB]

autocompletes to:

CREATE TRIGGER mytrg INSTEAD OF INSERT ON SET

not sure how that SET is getting in there -- some incorrect tab completion match? Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security hooks on object creation
The attached patch is a revised version.

- utils/hooks.h was renamed to catalog/objectaccess.h
- The numeric suffix at the tail of InvokeObjectAccessHook0() is gone.
- Fixed a bug in ATExecAddColumn; it gave AttributeRelationId to the hook instead of RelationRelationId.

In addition, I found that we had not put the post-creation hook on foreign data wrappers, foreign servers and user mappings. So, I put this hook around their command handlers like any other object classes. Thanks, (2010/11/24 12:07), Robert Haas wrote: 2010/11/23 KaiGai Kohei kai...@ak.jp.nec.com: What I'm not quite sure about is where to put the definitions you've added to a new file utils/hooks.h; I don't feel that's a very appropriate location. It's tempting to put them in utils/acl.h just because this is vaguely access-control related and that header is already included in most of the right places, but maybe that's too much of a stretch; or perhaps catalog/catalog.h, although that doesn't feel quite right either. If we are going to add a new header file, I still don't like utils/hooks.h much - it's considerably more generic than can be justified by its contents. I don't think utils/acl.h is the right place long-term, because we intended not to restrict the purpose of these hooks to access control, as you mentioned. I think somewhere under the catalog/ directory is a good idea because it hooks events that modify system catalogs. How about catalog/hooks.h, instead of utils/hooks.h? Well, if we're going to create a new header file, I think it should be called something like catalog/objectaccess.h, rather than just hooks.h. But I'd rather reuse something that's already there, all things being equal.
-- KaiGai Kohei kai...@ak.jp.nec.com diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index dcc53e1..9a38207 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -40,6 +40,7 @@ #include catalog/index.h #include catalog/indexing.h #include catalog/namespace.h +#include catalog/objectaccess.h #include catalog/pg_attrdef.h #include catalog/pg_constraint.h #include catalog/pg_inherits.h @@ -1188,6 +1189,10 @@ heap_create_with_catalog(const char *relname, } } + /* Post creation of new relation */ + InvokeObjectAccessHook(OAT_POST_CREATE, + RelationRelationId, relid, 0); + /* * Store any supplied constraints and defaults. * diff --git a/src/backend/catalog/pg_constraint.c b/src/backend/catalog/pg_constraint.c index 8b4f8c6..1ed108f 100644 --- a/src/backend/catalog/pg_constraint.c +++ b/src/backend/catalog/pg_constraint.c @@ -18,6 +18,7 @@ #include access/heapam.h #include catalog/dependency.h #include catalog/indexing.h +#include catalog/objectaccess.h #include catalog/pg_constraint.h #include catalog/pg_operator.h #include catalog/pg_type.h @@ -360,6 +361,10 @@ CreateConstraintEntry(const char *constraintName, DEPENDENCY_NORMAL); } + /* Post creation of a new constraint */ + InvokeObjectAccessHook(OAT_POST_CREATE, + ConstraintRelationId, conOid, 0); + return conOid; } diff --git a/src/backend/catalog/pg_conversion.c b/src/backend/catalog/pg_conversion.c index 9578184..853ed0e 100644 --- a/src/backend/catalog/pg_conversion.c +++ b/src/backend/catalog/pg_conversion.c @@ -18,6 +18,7 @@ #include access/sysattr.h #include catalog/dependency.h #include catalog/indexing.h +#include catalog/objectaccess.h #include catalog/pg_conversion.h #include catalog/pg_conversion_fn.h #include catalog/pg_namespace.h @@ -131,6 +132,10 @@ ConversionCreate(const char *conname, Oid connamespace, recordDependencyOnOwner(ConversionRelationId, HeapTupleGetOid(tup), conowner); + /* Post creation of a new conversion */ + 
+	InvokeObjectAccessHook(OAT_POST_CREATE,
+						   ConversionRelationId, HeapTupleGetOid(tup), 0);
+
 	heap_freetuple(tup);
 
 	heap_close(rel, RowExclusiveLock);
diff --git a/src/backend/catalog/pg_namespace.c b/src/backend/catalog/pg_namespace.c
index 71ebd7a..978be51 100644
--- a/src/backend/catalog/pg_namespace.c
+++ b/src/backend/catalog/pg_namespace.c
@@ -17,6 +17,7 @@
 #include "access/heapam.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_namespace.h"
 #include "utils/builtins.h"
 #include "utils/rel.h"
@@ -75,5 +76,9 @@ NamespaceCreate(const char *nspName, Oid ownerId)
 	/* Record dependency on owner */
 	recordDependencyOnOwner(NamespaceRelationId, nspoid, ownerId);
 
+	/* Post creation of new schema */
+	InvokeObjectAccessHook(OAT_POST_CREATE,
+						   NamespaceRelationId, nspoid, 0);
+
 	return nspoid;
 }
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index 73de672..cd169a6 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -22,6 +22,7 @@
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
Re: [HACKERS] Label switcher function
The attached patch is a revised one. It provides two hooks: one informs core PG whether the supplied function needs to be hooked or not; the other is the actual hook at prepare, start, end and abort of function invocations. typedef bool (*needs_function_call_type)(Oid fn_oid); typedef void (*function_call_type)(FunctionCallEventType event, FmgrInfo *flinfo, Datum *private); The hook prototype was modified a bit from Robert's suggestion: because the FmgrInfo structure contains the OID of the function, it would be redundant to deliver the function's OID separately. The rest of the patch is revised according to the comments. I also fixed up source code comments which had become incorrect. Thanks, (2010/11/18 11:30), Robert Haas wrote: 2010/11/17 KaiGai Kohei kai...@ak.jp.nec.com: I revised my patch as attached. The hook function is modified and consolidated as follows: typedef enum FunctionCallEventType { FCET_BE_HOOKED, FCET_PREPARE, FCET_START, FCET_END, FCET_ABORT, } FunctionCallEventType; typedef Datum (*function_call_event_type)(Oid functionId, FunctionCallEventType event, Datum event_arg); extern PGDLLIMPORT function_call_event_type function_call_event_hook; Unlike the subject of this e-mail, it is no longer focused only on switching security labels during execution of certain functions. For example, we may use this hook to track certain functions for security auditing, performance tuning, and so on. In the case of SE-PgSQL, it shall return BoolGetDatum(true) if the target function is configured as a trusted procedure; then this invocation will be hooked by fmgr_security_definer. In the first call, it shall compute the security context to be assigned during execution on the FCET_PREPARE event. Then it switches to the computed label on the FCET_START event, and restores it on the FCET_END or FCET_ABORT event. This seems like it's a lot simpler than before, which is good.
It looks to me as though there should really be two separate hooks, though, one for what is now FCET_BE_HOOKED and one for everything else. For FCET_BE_HOOKED, you want a function that takes an Oid and returns a bool. For the other event types, the functionId and event arguments are OK, but I think you should forget about the save_datum stuff and just always pass fcache->flinfo and fcache->private. The plugin can get the effect of save_datum by passing around whatever state it needs to hold on to using fcache->private. So: bool (*needs_function_call_hook)(Oid fn_oid); void (*function_call_hook)(Oid fn_oid, FunctionCallEventType event, FmgrInfo *flinfo, Datum *private); Another general comment is that you've not done a very complete job updating the comments; there are several of them in fmgr.c that are no longer accurate. Also, please zap the unnecessary whitespace changes. Thanks,
-- KaiGai Kohei kai...@ak.jp.nec.com

 contrib/dummy_seclabel/dummy_seclabel.c       | 102 -
 src/backend/optimizer/util/clauses.c          |   8 ++
 src/backend/utils/fmgr/fmgr.c                 |  44 ---
 src/include/fmgr.h                            |  55 +
 src/test/regress/input/security_label.source  |  28 +++
 src/test/regress/output/security_label.source |  43 ++-
 6 files changed, 267 insertions(+), 13 deletions(-)

diff --git a/contrib/dummy_seclabel/dummy_seclabel.c b/contrib/dummy_seclabel/dummy_seclabel.c
index 8bd50a3..7acb512 100644
--- a/contrib/dummy_seclabel/dummy_seclabel.c
+++ b/contrib/dummy_seclabel/dummy_seclabel.c
@@ -12,14 +12,105 @@
  */
 #include "postgres.h"
 
+#include "catalog/pg_proc.h"
 #include "commands/seclabel.h"
 #include "miscadmin.h"
 
 PG_MODULE_MAGIC;
 
+PG_FUNCTION_INFO_V1(dummy_client_label);
+
 /* Entrypoint of the module */
 void _PG_init(void);
 
+/* SQL functions */
+Datum dummy_client_label(PG_FUNCTION_ARGS);
+
+static const char *client_label = "unclassified";
+
+typedef struct {
+	const char *old_label;
+	const char *new_label;
+	Datum		next_private;
+} dummy_stack;
+
+static needs_function_call_type needs_function_call_next = NULL;
+static
+function_call_type function_call_next = NULL;
+
+static bool
+dummy_needs_function_call(Oid fn_oid)
+{
+	char	   *label;
+	bool		result = false;
+	ObjectAddress object = { .classId = ProcedureRelationId,
+							 .objectId = fn_oid,
+							 .objectSubId = 0 };
+
+	if (needs_function_call_next &&
+		(*needs_function_call_next)(fn_oid))
+		return true;
+
+	label = GetSecurityLabel(&object, "dummy");
+	if (label && strcmp(label, "trusted") == 0)
+		result = true;
+
+	if (label)
+		pfree(label);
+
+	return result;
+}
+
+static void
+dummy_function_call(FunctionCallEventType event,
+					FmgrInfo *flinfo, Datum *private)
+{
+	dummy_stack *stack;
+
+	switch (event)
+	{
+		case FCET_PREPARE:
+
Re: [HACKERS] Extensions, this time with a patch
On Thu, Nov 25, 2010 at 05:02, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Something like the attached, version 5 of the patch? I've been using the function name ParseConfigFp because the internal parameter was called fp in the previous function body. I suppose that could easily be changed at commit time if necessary. Thanks. I'll move the patch to Ready for Committer. Here is a list of items for committers, including only cosmetic changes:
- Comments in recovery.conf.sample need to be adjusted: # (The quotes around the value are NOT optional, but the = is.) This appears to be the only place that previously documented that the quotes cannot be omitted.
- We might not need to check the result of ParseConfigFp() because it always raises FATAL on errors.
- We could remove the unused argument calling_file in ParseConfigFp().
- The struct names name_value_pair and ConfigNameValuePair feel a bit redundant. I'd prefer something like ConfigVariable.
- fp might be a better name than FILE *fd. There are 4 usages in xlog.c.
Extensions will need the support for custom_variable_classes as it is done now, and as you say, the recovery will just error out. You have to clean your recovery.conf file and then try again (I just tried and confirmed). I personally don't see any harm in having the feature in all currently known use cases, and I don't see any point in walking an extra mile here to remove a feature in some cases. Sure. We will leave them. -- Itagaki Takahiro
Re: [HACKERS] contrib: auth_delay module
(2010/11/19 16:57), KaiGai Kohei wrote: (2010/11/18 2:17), Robert Haas wrote: On Wed, Nov 17, 2010 at 10:32 AM, Ross J. Reedstrom reeds...@rice.edu wrote: On Tue, Nov 16, 2010 at 09:41:37PM -0500, Robert Haas wrote: On Tue, Nov 16, 2010 at 8:15 PM, KaiGai Kohei kai...@ak.jp.nec.com wrote: If we don't need a PoC module for each new hook, I'm not strongly motivated to push it into the contrib tree. What is your opinion? I'd say let it go, unless someone else feels strongly about it. I would use this module (rate-limiting new connection attempts) as soon as I could. Putting a cap on the potential CPU usage on a production DB caused by either a blackhat or a developer's configuration mistake (leaving the port accessible) is definitely useful, even in the face of max_connections. My production apps already have their connections and seldom need new ones; they all use CPU, though. If KaiGai updates the code per the previous discussion, would you be willing to take a crack at adding documentation? P.S. Your email client seems to be setting the Reply-To address to a ridiculous value. OK, I'll revise my patch according to the previous discussion. The attached patch is the revised version.
- The logging part of auth_delay was removed. The module now focuses on injecting a delay of a few seconds when authentication fails.
- Documentation was added, as for any other contrib module.
Thanks,
-- KaiGai Kohei kai...@ak.jp.nec.com

 contrib/Makefile                |  1 +
 contrib/README                  |  5 +++
 contrib/auth_delay/Makefile     | 14 +++
 contrib/auth_delay/auth_delay.c | 71 
 doc/src/sgml/auth-delay.sgml    | 76 +++
 doc/src/sgml/contrib.sgml       |  1 +
 doc/src/sgml/filelist.sgml      |  1 +
 7 files changed, 169 insertions(+), 0 deletions(-)

diff --git a/contrib/Makefile b/contrib/Makefile
index e1f2a84..5747bcc 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -6,6 +6,7 @@ include $(top_builddir)/src/Makefile.global
 SUBDIRS = \
 		adminpack	\
+		auth_delay	\
 		auto_explain	\
 		btree_gin	\
 		btree_gist	\
diff --git a/contrib/README b/contrib/README
index 6d29cfe..a6abd94 100644
--- a/contrib/README
+++ b/contrib/README
@@ -28,6 +28,11 @@ adminpack -
 	File and log manipulation routines, used by pgAdmin
 	by Dave Page dp...@vale-housing.co.uk
 
+auth_delay
+	Add a delay of a few seconds when authentication fails, to make
+	brute-force attacks on database passwords more difficult.
+	by KaiGai Kohei kai...@ak.jp.nec.com
+
 auto_explain -
 	Log EXPLAIN output for long-running queries
 	by Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp
diff --git a/contrib/auth_delay/Makefile b/contrib/auth_delay/Makefile
new file mode 100644
index 000..09d2d54
--- /dev/null
+++ b/contrib/auth_delay/Makefile
@@ -0,0 +1,14 @@
+# contrib/auth_delay/Makefile
+
+MODULES = auth_delay
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/auth_delay
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c
new file mode 100644
index 000..746ac4b
--- /dev/null
+++ b/contrib/auth_delay/auth_delay.c
@@ -0,0 +1,71 @@
+/* -
+ *
+ * auth_delay.c
+ *
+ * Copyright (C) 2010, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		contrib/auth_delay/auth_delay.c
+ *
+ * -
+ */
+#include "postgres.h"
+
+#include "libpq/auth.h"
+#include "utils/guc.h"
+#include "utils/timestamp.h"
+
+#include <unistd.h>
+
+PG_MODULE_MAGIC;
+
+void	_PG_init(void);
+
+/* GUC Variables */
+static int	auth_delay_seconds;
+
+/* Original Hook */
+static ClientAuthentication_hook_type original_client_auth_hook = NULL;
+
+/*
+ * Check authentication
+ */
+static void
+auth_delay_checks(Port *port, int status)
+{
+	/*
+	 * Any other plugins which use the ClientAuthentication_hook.
+	 */
+	if (original_client_auth_hook)
+		original_client_auth_hook(port, status);
+
+	/*
+	 * Inject a delay of a few seconds when authentication fails.
+	 */
+	if (status != STATUS_OK)
+	{
+		sleep(auth_delay_seconds);
+	}
+}
+
+/*
+ * Module Load Callback
+ */
+void
+_PG_init(void)
+{
+	/* Define custom GUC variables */
+	DefineCustomIntVariable("auth_delay.seconds",
+							"Seconds to delay when authentication fails",
+							NULL,
+							&auth_delay_seconds,
+							2,
+							0, INT_MAX,
+							PGC_POSTMASTER,
+							GUC_UNIT_S,
+							NULL,
+							NULL);
+	/* Install Hooks */
+	original_client_auth_hook = ClientAuthentication_hook;
+	ClientAuthentication_hook = auth_delay_checks;
+}
diff --git a/doc/src/sgml/auth-delay.sgml b/doc/src/sgml/auth-delay.sgml
new file mode 100644
index 000..372702d
---
[HACKERS] pg_execute_from_file review
I've just looked at pg_execute_from_file[1]. The idea here is to execute all the SQL commands in a given file. My comments:
* It applies well enough, and builds fine
* It seems to work, and I've not come up with a way to make it break
* It seems useful, and to follow the limited design discussion relevant to it
* It looks more or less like the rest of the code base, so far as I know
Various questions and/or nits:
* I'd like to see the docs slightly expanded, specifically to describe parameter replacement. I wondered for a while whether I needed to supply the parameters in any specific way, before reading the code and realizing they can be whatever I want.
* Does anyone think it needs representation in the test suite?
* Is it at all bad to include spi.h in genfile.c? IOW, should this function live elsewhere? It seems reasonable to me to do it as written, but I thought I'd ask.
* In the snippet below, it seems best just to use palloc0(): query_string = (char *) palloc((fsize + 1) * sizeof(char)); memset(query_string, 0, fsize + 1);
* Shouldn't it include SPI_push() and SPI_pop()?
[1] http://archives.postgresql.org/message-id/m262wf6fnz@2ndquadrant.fr
-- Joshua Tolley / eggyknap End Point Corporation http://www.endpoint.com