Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Thursday 04 September 2008, Robert Treat wrote: To paraphrase, if you can't write a config file correctly before restarting, I do not want you anywhere near any instance of a production system. Do you really want the TCO of PostgreSQL to rise that much when the software could help lower it? If you're a shop that can't have experts in every domain doing the level 1 nightly fixes, maybe you'd still like those people to be able to follow a procedure that involves a server config change. On what grounds do you want this not to be possible? Regards, -- dim
Re: [HACKERS] New FSM patch
Thanks for the review! Simon Riggs wrote: Can I check some aspects of this related to Hot Standby? Some of them sound obvious, but worth double checking. * There will be no need to read the FSM by any normal operation of a read-only transaction, so locking correctness considerations can possibly be ignored during recovery. Correct. A HOT prune operation doesn't currently update the FSM, but if it did, that would be a case of a read-only transaction updating the FSM. But we can't prune anyway in a hot standby. pg_freespacemap exists though: would we need to prevent that from executing during recovery, or will the FSM be fully readable? i.e. does redo take appropriate locks already (I don't see any Cleanup locks being required). pg_freespacemap, the contrib module? Yeah, the FSM should be fully readable. During normal operation, when a bottom-level page is updated and the update needs to be bubbled up, the upper-level page is pinned and locked before the lock on the lower-level page is released. That interlocking is not done during WAL replay; the lock on the lower-level page is released before locking the upper page. It's not required during WAL replay, as there are no concurrent updates to the FSM. * The FSM will be continuously maintained during recovery, so the FSM will now be correct and immediately available when recovery completes? Correct. * There are no cases where a screwed-up FSM will crash either recovery (FATAL+) or halt normal operation (PANIC)? Hmm, there are no explicit elog(FATAL/PANIC) calls, but if the FSM is really corrupt, you can probably get a segfault. That should be fixable by adding more sanity checks, though. * Incomplete action cleanup is fairly cheap and doesn't rely on the FSM being searchable to correct the error? This last is a hard one... Correct. Do we have the concept of an invalid/corrupt FSM? What happens if the logic goes wrong and we have a corrupt page? Will that mean we can't complete actions against the heap? Some scenarios I can think of:

Scenario: The binary tree on a page is corrupt, so that the value of an upper node is < Max(leftchild, rightchild). Consequence: Searchers won't see the free space below that node, and will look elsewhere.

Scenario: The binary tree on a page is corrupt, so that the value of an upper node is > Max(leftchild, rightchild). Consequence: Searchers will notice the corruption while trying to traverse down that path, and throw an elog(WARNING) in search_avail(). fsm_search will retry the search, and will in the worst case go into an infinite loop. That's obviously not good. We could automatically fix the upper nodes of the tree, but that would wipe evidence that would be useful in debugging.

Scenario: An upper-level page is corrupt, claiming that there's no free space on a lower-level page, while there actually is. (The opposite, where an upper-level page claims that there *is* free space on a lower-level page while there actually isn't, is now normal. The searcher will update the upper-level page in that case.) Consequence: A searcher won't see that free space, and will look elsewhere.

Scenario: An upper-level page is corrupt, claiming that there is free space on a lower-level page that doesn't exist. Consequence: Searchers will elog(ERROR), trying to read the non-existent FSM page.

The last scenario would lead to heap inserts/updates failing. We could avoid that by checking that the page exists with RelationGetNumberOfBlocks(), but I'm not sure if it's worth the cost. Are there really any changes to these files?
src/include/storage/bufmgr.h src/include/postmaster/bgwriter.h Hmm, apparently not. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] StartupCLOG
I notice that StartupCLOG zeroes out entries later than the nextxid when we complete recovery in StartupXLOG; the reason given is safety in case we crash. ISTM that we should also do that whenever we see a Shutdown Checkpoint in WAL, since that can be caused by a shutdown immediate, shutdown abort or crash. In each of those cases it's possible that we would have the same conditions as exist at the end of WAL. Can't see a failure case of importance though, just wanted to flag it up as a possibility. Better a false positive with these types of concern. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Conflict resolution in Multimaster replication(Postgres-R)
Hello Srinivas, M2Y wrote: My basic question is: in multimaster replication, if each site goes ahead and does the modifications issued by the transaction and then sends the writeset to others in the group, how can the ACID properties be maintained? Well, there are different approaches. With regard to locking, you can differentiate between pessimistic and optimistic locking. Synchronous replication solutions like two-phase commit and most statement-based solutions do pessimistic locking: upon updating a row - and thus locking it - they make sure every other node locks the same row as well before proceeding. Asynchronous replication solutions (including the original Postgres-R algorithm just mentioned by Robert Hodges) do optimistic locking: they proceed with the transaction even if they don't know in advance whether the same rows can be locked for that transaction on all other nodes. They optimistically assume it will be possible to lock them in most cases. In all other cases, they abort one of the two conflicting transactions. Here you can distinguish even further between eager and lazy solutions: a lazy solution defers the check against conflicting transactions on other nodes to some time after the commit. Upon detecting such a conflict, it must thus abort an already committed transaction (late abort). That's a violation of the ACID properties, but it's worthwhile in certain cases. On the other hand, an eager solution does the conflict checking and possible aborting *before* committing, thus maintaining full ACID properties. That method then suffers from an increased commit latency, dependent on the network. See [1] for more information. A more general question is: for transaction isolation level 4 (serializable), should information such as row locks be transmitted across sites? If not, what is the mechanism to address concurrency with serializability? Depends on the algorithm, but transferring every single lock is generally considered to be too expensive. Instead, locks are transferred indirectly by sending statements or changesets or such. What kind of replication are you interested in? Regards Markus Wanner [1]: Terms and Definitions for Database Replication: http://www.postgres-r.org/documentation/terms -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New FSM patch
On Thu, 2008-09-04 at 11:07 +0300, Heikki Linnakangas wrote: Thanks for the review! Not as thorough as I would have liked, I must admit. Thanks for the other confirmations. Scenario: The binary tree on a page is corrupt, so that the value of an upper node is > Max(leftchild, rightchild). Consequence: Searchers will notice the corruption while trying to traverse down that path, and throw an elog(WARNING) in search_avail(). fsm_search will retry the search, and will in the worst case go into an infinite loop. That's obviously not good. We could automatically fix the upper nodes of the tree, but that would wipe evidence that would be useful in debugging. We probably need to break out of infinite loops, especially ones that output warning messages on each loop. :-) -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
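The failure mode being discussed here is easiest to see in outline: fsm_search() restarts its search whenever search_avail() reports an inconsistent upper node, so a persistently corrupt page can keep it looping while emitting warnings. A rough sketch of the bounded-retry idea (the cap, the sentinel value and the overall structure are hypothetical illustrations, not the actual freespace-map code):

#include "postgres.h"
#include "storage/block.h"
#include "utils/rel.h"

#define FSM_MAX_SEARCH_RETRIES  1000    /* hypothetical cap on restarts */
#define FSM_SEARCH_RETRY        ((BlockNumber) 0xFFFFFFFE)  /* hypothetical
                                 * sentinel: "corruption seen, try again" */

/* assumed to exist: returns a usable block, InvalidBlockNumber when there
 * is genuinely no space, or FSM_SEARCH_RETRY after warning about a corrupt
 * upper node */
static BlockNumber search_avail(Relation rel, Size needed);

/*
 * Sketch only: look for a heap page with at least "needed" bytes free,
 * but give up after a fixed number of corruption-triggered restarts.
 */
static BlockNumber
fsm_search_sketch(Relation rel, Size needed)
{
    int         restarts = 0;

    for (;;)
    {
        BlockNumber blkno = search_avail(rel, needed);

        if (blkno != FSM_SEARCH_RETRY)
            return blkno;       /* found a page, or genuinely no space */

        if (++restarts > FSM_MAX_SEARCH_RETRIES)
        {
            /* Stop rather than loop forever on a corrupt page. */
            elog(WARNING, "giving up FSM search after %d restarts", restarts);
            return InvalidBlockNumber;  /* caller extends the relation */
        }
    }
}

Whether the right reaction at the cap is to stop searching, to repair the upper nodes, or to rebuild the page is exactly the trade-off against preserving debugging evidence that Heikki mentions.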
Re: [HACKERS] Window functions patch v04 for the September commit fest
Hitoshi Harada wrote: BTW, I think it is better to put together the discussion points we have had as a general roadmap to completing window functions. It is not only about the features for the next release but the complete set of tasks. Where should it go? Wiki, or my design docs? That's up to you, really. I like your design docs page, but you're free to use the Wiki as well, of course. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Conflict resolution in Multimaster replication(Postgres-R)
Thank you very much Robert and Markus. What kind of replication are you interested in? Markus: It looks like the hybrid approach used by Postgres-R (as described in that paper) is good. Thanks, Srinivas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCHES] hash index improving v3
On Thu, Sep 4, 2008 at 10:06 AM, Simon Riggs [EMAIL PROTECTED] wrote: You don't give the text of the query used to do these performance tests, so I can't validate your test results. The attachment is the code to generate the text. Just

$ g++ my-permu-code.cpp
$ ./a.out > /tmp/words

-- Best Regards, Xiao Meng DKERC, Harbin Institute of Technology, China Gtalk: [EMAIL PROTECTED] MSN: [EMAIL PROTECTED] http://xiaomeng.yo2.cn

//--------------------------------------------------------------
//
// FILE NAME       : X937out.cpp
//
// DESCRIPTIVE NAME: Provides X937 Output Plug-in APIs
//
// AUTHOR          : Srinivasan MS
//
// MODULE TYPE     : C++ Source File
//
//--------------------------------------------------------------

#include <iostream>
#include <cstring>
using namespace std;

long long totCount;

// this function finds the other permutations by rotating the string
// in a circular manner
void circularPrint(char *str)
{
    int len = strlen(str);
    char *mystr = new char[len + 1];
    int i, j;

    for (i = 0; i < len; i++)
    {
        for (j = 0; j < len; j++)
            mystr[j] = str[(i + j) % len];
        mystr[j] = 0;
        totCount++;
        // comment the line below to suppress the string output
        cout << mystr << endl;
    }
    delete[] mystr;
    return;
}

void permutation(char prefix[], char suffix[])
{
    int length = strlen(suffix);
    int len = strlen(prefix);
    int i = 0, j = 0;
    char *myprefix = new char[len];

    if (len)
    {
        strncpy(myprefix, prefix, len - 1);
        myprefix[len - 1] = 0;
        for (i = 0; i < length; i++)
        {
            char *mysuffix = new char[length + 2];
            mysuffix[0] = prefix[len - 1];
            // rotate the current append and prepare the append for the next pass
            for (j = 0; j < length; j++)
            {
                mysuffix[j + 1] = suffix[(i + j) % length];
            }
            mysuffix[j + 1] = 0;
            permutation(myprefix, mysuffix);
            delete[] mysuffix;
        }
    }
    else
    {
        // found a permutation, now find the other
        // permutations by rotating the string
        circularPrint(suffix);
    }
    delete[] myprefix;
}

int main()
{
    char prefix[] = "123456789a";  // first N-1 characters of the string of length N
    char suffix[] = "0";           // last character of the string

    // find the permutations
    permutation(prefix, suffix);
    return 0;
}

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] xml2 vs XMLFunctions
Hi, I am writing you this mail because I am currently using the xml2 functionality in PostgreSQL 8.3.x and want to replace it with the newer API, as mentioned here: From PostgreSQL 8.3 on, there is XML-related functionality based on the SQL/XML standard in the core server. That functionality covers XML syntax checking and XPath queries, which is what this module does, and more, but the API is not at all compatible. It is planned that this module will be removed in PostgreSQL 8.4 in favor of the newer standard API, so you are encouraged to try converting your applications. If you find that some of the functionality of this module is not available in an adequate form with the newer API, please explain your issue to pgsql-hackers@postgresql.org so that the deficiency can be addressed. The only function of xml2 I ever used was xpath_bool(document, query) returns bool, where document represents a table column. How can I do this with the new API? I thought that 9.14.2. Processing XML: xpath(xpath, xml[, nsarray]) might be the right function, but I don't know how to specify the column where the xml is stored. Can you please give me an example? I only have to implement xpath functions that return boolean. I'd like to run a query like: SELECT foo FROM bar AS x WHERE xpath('xpath', x.xmlcolumn [, nsarray]) Cheers, Tobias
Re: [HACKERS] [PATCH] Cleanup of GUC units code
Hannu Krosing escribió: On Wed, 2008-09-03 at 20:01 -0400, Alvaro Herrera wrote: Yes there is --- it's the SI. http://en.wikipedia.org/wiki/SI#SI_writing_style I don't know about it being evil and punishment, but it's wrong. SI defines decimal-based prefixes, where k = kilo = 1000, so our current conf use is also wrong. Actually, this has been a moving target. For a certain length of time, some standards did accept that k meant 1024 in computing context; see http://en.wikipedia.org/wiki/Binary_prefix So we're not _absolutely_ wrong here; at least not until KiB are more widely accepted and kB more widely refused to mean 1024 bytes. The relevant standard has been published just this year by ISO. http://en.wikipedia.org/wiki/ISO/IEC_8#Binary_prefixes So this is new territory, whereas case-sensitivity of prefixes and unit abbreviations has existed for decades. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Thu, Sep 04, 2008 at 01:26:44AM +0300, Hannu Krosing wrote: So Andrew's opinion was that Mb (meaning Mbit) is different from MB (for megabyte) and that someone who thinks that we define shared buffers in megabits can get confused and order the wrong kind of network card? I know it's fun to point and laugh instead of giving an argument, but the above is not what I said. What I said is that there is a technical difference between at least some of these units, and one that is relevant in some contexts where we have good reason to believe Postgres is used. So it seems to me that there is at least a _prima facie_ reason in favour of making case-based decisions. Your argument against that appears to be, "Well, people can be sloppy." Alvaro's suggestion seems to me to be a better one. It is customary, in servers with large complicated configuration systems, for the server to come with a tool that validates the configuration file before you try to load it. Postfix does this; apache does it; so does BIND. Heck, even NSD (which is way less configurable than BIND) does this. Offering such a tool provides considerably more benefit than the questionable one of allowing people to type whatever they want into the configuration file and suppose that the server will by magic know what they meant. I can understand Alvaro's stance more readily - if we have irrational constraints on what can go into the conf file, and people won't listen to reason Extending your current reasoning, it's irrational that all the names of the parameters have to be spelled correctly. Why can't we just accept log_statement_duration_min? It's _obvious_ that it's the same thing as log_min_duration_statement! It's silly to expect that harried administrators have to spell these options correctly. Why can't we parse the whole file, separating each label by _? Then if any arrangement of those labels matches a real configuration parameter, select that one as the thing to match and proceed from there? A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Conflict resolution in Multimaster replication(Postgres-R)
Hello Srinivas, M2Y wrote: Markus: It looks like the hybrid approach used by Postgres-R (as described in that paper) is good. Well, yeah. That's why I am working on it ;-) You are very welcome to download the patch and dig into the sources. See www.postgres-r.org for more information. To answer your original question in more detail: Suppose there are two sites in the group, let's say A and B, managing a database D. Two transactions TA and TB, started on sites A and B respectively at nearly the same time, want to update the same row of a table in the database. As no locking structures or other concurrency-handling structures are replicated, each will go ahead, do the modifications in its own database and send the writeset. Correct so far. Note that both transactions might have applied changes, but they have not committed yet. In eager mode we rely on the Group Communication System to deliver these two changesets [1] in the same order on both nodes. Let's say both receive TA's changeset first, then TB's. The backend which processed TA on node A can commit, because its changes don't conflict with anything else. The changeset of TB is forwarded to a helper backend, which tries to apply its changes. But the helper backend detects the conflict against TA and aborts (because it knows TA takes precedence on all other nodes as well). On node B, the backend which processed TB has to wait with its commit, because another changeset, namely TA's, came in first. For that changeset a helper backend is started as well, which applies the changes of TA. During application of the changes, that helper backend detects a conflict against the (yet uncommitted) changes of TB. As it knows its transaction TA takes precedence over TB (on all other nodes as well), it tells TB to abort and continues applying its own changes. I hope that was an understandable explanation. Regards Markus Wanner [1]: In the original Postgres-R paper, these are called writesets. But in my implementation, I've altered the meaning somewhat. Because of that (and because I admittedly like changeset better), I've decided to call them changesets now... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
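Reduced to pseudocode, the delivery-order rule Markus walks through above looks roughly like this (all type and function names here are hypothetical illustrations, not the actual Postgres-R sources):

#include <stdbool.h>

/*
 * Sketch of the eager rule: the GCS delivers all changesets in one total
 * order, and the first conflicting changeset in that order wins on every
 * node, so all nodes reach the same decision.
 */
typedef struct Changeset
{
    int     origin_node;        /* node whose transaction produced it */
    long    delivery_seq;       /* position in the GCS total order */
    /* ... changed tuples would follow ... */
} Changeset;

/* assumed helper routines, not real functions */
static void allow_local_commit(Changeset *cs);
static bool helper_backend_detects_conflict(Changeset *cs);
static void abort_conflicting_local_transaction(Changeset *cs);
static void apply_and_commit_remote_changes(Changeset *cs);

static void
on_changeset_delivered(Changeset *cs, int local_node_id)
{
    if (cs->origin_node == local_node_id)
    {
        /*
         * Our own transaction's changeset reached this point of the total
         * order without an earlier conflicting changeset aborting it, so
         * the waiting local backend may now commit.
         */
        allow_local_commit(cs);
        return;
    }

    /* Remote changeset: hand it to a helper backend for application. */
    if (helper_backend_detects_conflict(cs))
    {
        /*
         * The conflicting local transaction has not been delivered yet,
         * so the already-delivered remote changeset takes precedence:
         * abort the local transaction and keep applying.
         */
        abort_conflicting_local_transaction(cs);
    }
    apply_and_commit_remote_changes(cs);
}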
[HACKERS] Debugging methods
Hello, I am a beginner with Postgres and I am going through the code. I would like to know the debugging methods used in development. Some of my requirements are: for a given query, how parse structures are created in pg_parse_query, how they are analyzed and rewritten in pg_analyze_and_rewrite, and how the final plan is created in pg_plan_queries. I will go through the code, but I would like to know any debugging methods available to understand what happens for a given query. I have searched on the net and I am unable to find them. Sorry if it is available somewhere and I am asking again. Thanks, Srinivas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Debugging methods
Hi, M2Y wrote: I am a beginner to Postgres and I am going through code. I would like to know the debugging methods used in development. Try ./configure with '--enable-debug' and '--enable-cassert', as outlined in the developer's FAQ [1], where you certainly find more information as well. Then run the postmaster with '-A1 -d5' Regards Markus Wanner [1]: http://wiki.postgresql.org/wiki/Developer_FAQ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Sep 4, 2008, at 6:29 AM, Andrew Sullivan wrote: On Thu, Sep 04, 2008 at 01:26:44AM +0300, Hannu Krosing wrote: So Andrew's opinion was that Mb (meaning Mbit) is different from MB (for megabyte) and that someone who thinks that we define shared buffers in megabits can get confused and order the wrong kind of network card? I know it's fun to point and laugh instead of giving an argument, but the above is not what I said. What I said is that there is a technical difference between at least some of these units, and one that is relevant in some contexts where we have good reason to believe Postgres is used. So it seems to me that there is at least a _prima facie_ reason in favour of making case-based decisions. Your argument against that appears to be, "Well, people can be sloppy." Settings in postgresql.conf are currently case-insensitive. Except for the units. Alvaro's suggestion seems to me to be a better one. It is customary, in servers with large complicated configuration systems, for the server to come with a tool that validates the configuration file before you try to load it. Postfix does this; apache does it; so does BIND. Heck, even NSD (which is way less configurable than BIND) does this. Offering such a tool provides considerably more benefit than the questionable one of allowing people to type whatever they want into the configuration file and suppose that the server will by magic know what they meant. How would such a tool cope with, for example, shared_buffers being set to one eighth the size the DBA intended, due to their use of Mb rather than MB? Both of which are perfectly valid units to use to set shared buffers, even though we only support one right now. If the answer to that is something along the lines of "we don't support megabits for shared_buffers, and never will, because nobody in their right mind would ever intend to use megabits to set their shared buffer size"... that's a useful datapoint when it comes to designing for usability. Cheers, Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] GUC source file and line number]
Greg Smith [EMAIL PROTECTED] wrote:

    name        | Recommended | Current | Min   | Default | Max
   -------------+-------------+---------+-------+---------+----------
    wal_buffers | 1024kB      | 64kB    | 32 kB | 64 kB   | 2048 MB

Personally, I would take the Min, Default, and Max to mean what Greg intends; it's the Current one that gives me pause. The current value of this connection? The value that a new connection will currently get? The value which new connections will get after a reload with the current conf file? The value which new connections will get after a restart with the current conf file? I can understand how someone would take one of these four values to be what is meant by Default, though. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Thu, 2008-09-04 at 09:29 -0400, Andrew Sullivan wrote: On Thu, Sep 04, 2008 at 01:26:44AM +0300, Hannu Krosing wrote: So Andrew's opinion was that Mb (meaning Mbit) is different from MB (for megabyte) and that someone who thinks that we define shared buffers in megabits can get confused and order the wrong kind of network card? I know it's fun to point and laugh instead of giving an argument, but the above is not what I said. What I said is that there is a technical difference between at least some of these units, and one that is relevant in some contexts where we have good reason to believe Postgres is used. So it seems to me that there is at least a _prima facie_ reason in favour of making case-based decisions. Your argument against that appears to be, "Well, people can be sloppy." Alvaro's suggestion seems to me to be a better one. Agreed. Maybe this can even be implemented as a special switch to the postmaster (maybe -n or --dry-run, similar to make), not a separate command. I can understand Alvaro's stance more readily - if we have irrational constraints on what can go into the conf file, and people won't listen to reason Extending your current reasoning, it's irrational that all the names of the parameters have to be spelled correctly. It would be irrational to allow all letters in parameter names to be case-insensitive, except 'k' which has to be lowercase ;) The main point of confusion comes from not accepting KB, and this bites you when you go down from MB, with reasoning like "ok, it seems that units are in uppercase, so let's change 1MB to 768KB and see what happens". - Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
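For what it's worth, the mechanical side of the change being argued over is small either way: a case-insensitive unit suffix lookup boils down to something like the following sketch (hypothetical code written for illustration, not the actual guc.c parser):

#include <strings.h>            /* strcasecmp */

/*
 * Sketch only: map memory-unit suffixes to a multiplier in kilobytes,
 * ignoring case, so "kB", "KB", "Kb" and "kb" would all be accepted.
 */
typedef struct
{
    const char *suffix;
    int         kb_multiplier;
} MemUnit;

static const MemUnit mem_units[] = {
    {"kB", 1},
    {"MB", 1024},
    {"GB", 1024 * 1024},
};

static int
unit_multiplier(const char *suffix)
{
    int         i;

    for (i = 0; i < (int) (sizeof(mem_units) / sizeof(mem_units[0])); i++)
    {
        if (strcasecmp(suffix, mem_units[i].suffix) == 0)
            return mem_units[i].kb_multiplier;
    }
    return -1;                  /* unknown unit: reject the setting */
}

Whether accepting all casings is a usability win or an invitation to confuse megabits with megabytes is exactly the disagreement in this thread; the sketch only shows what the lenient variant would amount to.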
Re: [HACKERS] StartupCLOG
Simon Riggs [EMAIL PROTECTED] writes: I notice that StartupCLOG zeroes out entries later than the nextxid when we complete recovery in StartupXLOG, reason given is safety in case we crash. ISTM that we should also do that whenever we see a Shutdown Checkpoint in WAL, since that can be caused by a shutdown immediate, shutdown abort or crash. Er, what? The definition of a crash is the *lack* of a shutdown checkpoint. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Thu, Sep 04, 2008 at 07:01:18AM -0700, Steve Atkins wrote: Settings in postgresql.conf are currently case-insensitive. Except for the units. And, of course, filenames when you are using a case-sensitive filesystem. Because these are things that are defined by some convention other than the ones the PGDG made up. Since units fall into that category, it seems to me that we're stuck with using external conventions. one right now. If the answer to that is something along the lines of "we don't support megabits for shared_buffers, and never will, because nobody in their right mind would ever intend to use megabits to set their shared buffer size"... that's a useful datapoint when it comes to designing for usability. And you are going to establish this worldwide convention on what someone in their right mind would do how, exactly? For instance, I think nobody in their right mind would use KB to mean kilobytes. I suppose you could get a random sample of all current Postgres users to decide what makes sense, but then you'd have the problem of knowing whether you had a random sample, since the population isn't obviously identifiable. Or, we could just stick with the convention that we already have, and write a tool that captures this and other issues. Maybe even one that could later form the basis for an automatic tuning advisor, as well. The problem with appeals to common sense always turns out to be that different people's common sense leads them to different conclusions. (We had a devastating government in Ontario some years ago that claimed to be doing things that were just common sense; the Province is still cleaning up the mess.) A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] StartupCLOG
On Thu, 2008-09-04 at 11:12 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: I notice that StartupCLOG zeroes out entries later than the nextxid when we complete recovery in StartupXLOG, reason given is safety in case we crash. ISTM that we should also do that whenever we see a Shutdown Checkpoint in WAL, since that can be caused by a shutdown immediate, shutdown abort or crash. Er, what? The definition of a crash is the *lack* of a shutdown checkpoint. Yes, but that's not what I'm saying. I was thinking about what happens when you are performing a PITR using log records that contain a crash/recovery/shutdown checkpoint sequence. I take it there's no problem there? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Debugging methods
M2Y [EMAIL PROTECTED] writes: I am a beginner to Postgres and I am going through code. I would like to know the debugging methods used in development. Some of my requirements are; for a given query, how parse structures are created in pg_parse_query, how they are analyzed and rewritten in pg_analyze_and_rewrite and how the final plan is created in pg_plan_queries. What I tend to do when trying to debug those areas is to set breakpoints at interesting places with gdb, and then use commands like call pprint(node_pointer) to dump the contents of specific parse or plan trees to the postmaster log. The reason that outfuncs.c supports so many node types (many that can't ever appear in stored rules) is exactly to make it useful for examining internal data structures this way. Another possibility is to turn on debug_print_plan and so on, but those settings only show you the finished results of parsing or planning, which isn't real helpful for understanding how the code gets from point A to point B. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
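A variant of the gdb trick Tom describes is to drop a temporary dump call into the code path you are studying and watch the server log; nodeToString(), pprint() and elog() are the real facilities he refers to, but the wrapper below and the exact spot it is called from are just an illustrative sketch:

#include "postgres.h"
#include "nodes/nodes.h"        /* nodeToString() */
#include "nodes/print.h"        /* pprint(), for use from gdb */

/*
 * Temporary instrumentation while studying parser/planner behaviour:
 * write a readable dump of any node tree to the server log.  Remove the
 * calls again once you understand the code path.
 */
static void
debug_dump_node(const char *label, void *node)
{
    elog(LOG, "%s: %s", label, nodeToString(node));
}

/* example use, e.g. inside pg_analyze_and_rewrite():
 *     debug_dump_node("raw parse tree", parsetree);
 */

This complements the debug_print_parse/debug_print_plan settings: the GUCs show the finished trees, while a call placed by hand shows the intermediate state at exactly the point you care about.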
Re: [HACKERS] StartupCLOG
Simon Riggs [EMAIL PROTECTED] writes: I was thinking about what happens when you are performing a PITR using log records that contain a crash/recovery/shutdown checkpoint sequence. I take it there's no problem there? I don't really see one. I believe the reason for the StartupCLOG action is just to make sure that clog doesn't claim that any transactions are committed that weren't committed according to the WAL, or more precisely by the portion of WAL we chose to read. Consider PITR stopping short of the actual WAL end: it would clearly be possible that the current page of clog says that some future transactions are committed, but in our new database history we don't want them to be so. I think that the code is also trying to guard against a similar situation in a crash where WAL has been damaged and can't be read all the way to the end. Since the PITR slave isn't going to make any changes to clog in the first place that it isn't told to by WAL, it's hard to see how any divergence would arise. It could diverge when the slave stops slaving and goes live, but at that point it's going to do StartupCLOG itself. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
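The invariant Tom is describing - clog must never claim a commit that the chosen WAL history did not make - amounts to clearing the status slots from nextXid to the end of its clog page. A simplified sketch of that idea (hypothetical helper and constants, not the actual StartupCLOG code in clog.c):

#include "postgres.h"

#define SKETCH_XACTS_PER_BYTE   4       /* clog keeps 2 status bits per xact */

/*
 * Sketch only: make every status slot from nextXid to the end of its clog
 * page read as zero ("in progress"), so no transaction beyond the end of
 * the replayed WAL can appear committed.
 */
static void
zero_clog_beyond_sketch(TransactionId nextXid, char *clog_page, int xacts_per_page)
{
    int         slot;
    int         first = nextXid % xacts_per_page;  /* slot of nextXid on its page */

    for (slot = first; slot < xacts_per_page; slot++)
    {
        int         byteno = slot / SKETCH_XACTS_PER_BYTE;
        int         shift = (slot % SKETCH_XACTS_PER_BYTE) * 2;

        clog_page[byteno] &= ~(0x03 << shift);      /* clear the two status bits */
    }
}

As Tom notes, a PITR slave never writes anything to clog that WAL did not tell it to, so this step only matters at the moment the chosen history ends and the server goes live.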
Re: [HACKERS] StartupCLOG
On Thu, 2008-09-04 at 12:18 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: I was thinking about what happens when you are performing a PITR using log records that contain a crash/recovery/shutdown checkpoint sequence. I take it there's no problem there? I don't really see one. OK, cool. I'm just trying to shake out all the possible problems, so sorry if this one was a false positive. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extending grant insert on tables to sequences
On Wed, Sep 3, 2008 at 7:03 PM, Tom Lane [EMAIL PROTECTED] wrote: In short, this patch isn't much more ready to commit than it was in the last fest. Just for the record, I put up this updated patch just because there was an entry for Extending grant insert on tables to sequences in this Commit Fest without an updated patch attached. -- regards, Jaime Casanova PostgreSQL support and training. Systems consulting and development. Guayaquil - Ecuador Cel. (593) 87171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Column level privileges was:(Re: [HACKERS] Extending grant insert on tables to sequences)
Jaime, * Jaime Casanova ([EMAIL PROTECTED]) wrote: On 7/25/08, Stephen Frost [EMAIL PROTECTED] wrote: Yes, I'm working on it hi, any work on it? may i help? If you look at the commitfest, I've posted my WIP so far there. Most of the grammar, parser, and catalog changes are there. There are a couple of bugs in that code that I'm working to run down, but otherwise I think it's pretty good. I do need to add in the dependency tracking as well though, and that's what I'm planning to work on next. A piece which can be broken off pretty easily is adding support to track the columns used through to the executor so we can check the permissions in the right place. You should review Tom's #2 comment here: http://archives.postgresql.org/pgsql-patches/2008-05/msg00111.php Let me know if you'll be able to work on this or not. If not then I'll get to it after I'm happy with the other pieces of the patch. Thanks, Stephen
Re: [HACKERS] [PATCHES] hash index improving v3
On Thu, 2008-09-04 at 21:03 +0800, Xiao Meng wrote: On Thu, Sep 4, 2008 at 10:06 AM, Simon Riggs [EMAIL PROTECTED] wrote: You don't give the text of the query used to do these performance tests, so I can't validate your test results. The attachment is the code to generate the text. Just $ g++ my-permu-code.cpp $ ./a.out > /tmp/words That generates the data, fine. What about the test query? You say you are running this command line: pgbench -U postgres -n -f /tmp/query.sql dict Where is query.sql? It doesn't seem a big ask to be shown the SQL statement that is being executed, so we understand what is going on. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Need more reviewers!
Hackers, We currently have 38 patches pending, and only nine people reviewing them. At this rate, the September commitfest will take three months. If you are a postgresql hacker at all, or even want to be one, we need your help reviewing patches! There are several easy patches in the list, so I can assign them to beginners. Please volunteer now! -- --Josh Josh Berkus PostgreSQL San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need more reviewers!
On Thu, Sep 4, 2008 at 1:45 PM, Josh Berkus [EMAIL PROTECTED] wrote: We currently have 38 patches pending, and only nine people reviewing them. At this rate, the September commitfest will take three months. I'll push forward on reviewing and testing Xiao's hash index improvements for inclusion into core. Though, someone will still need to review my stuff. -- Jonah H. Harris, Senior DBA myYearbook.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need more reviewers!
Jonah H. Harris [EMAIL PROTECTED] writes: I'll push forward on reviewing and testing Xiao's hash index improvements for inclusion into core. Though, someone will still need to review my stuff. I think what the hash index patch really needs is some performance testing. I'm willing to take responsibility for the code being okay or not, but I haven't got any production-grade hardware to do realistic performance tests on. I'd like to see a few more scenarios tested than the one provided so far: in particular, wide vs narrow index keys and good vs bad key distributions. It strikes me that there's an intermediate posture that the patch doesn't provide: namely, store *both* hash code and original index key in each index entry. This would make the index larger rather than smaller, but you could still do the intra-page sort by hash code, which I think is likely a big win. In exchange for the bigger index you'd avoid fruitless trips to the heap for chance hashcode matches. So it seems far from clear to me that the hash-code-only solution is necessarily the best. If anyone is willing to do comparative performance testing, I'll volunteer to make up two variant patches that do it both ways and are otherwise equivalent. (I wouldn't be too surprised if testing shows that each way could be a win depending on the situation. In that case we could think about how to provide a switch to let users pick the method. But for the moment let's just test two separate patches.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
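Either variant keeps the entries on a bucket page ordered by their hash code, so a probe can binary-search the page instead of scanning it; the difference is only whether a matching hash code must then be verified against the heap (hash-only) or can first be compared against a stored key. A minimal sketch of the intra-page ordering (hypothetical names, not the actual patch code):

#include <stdint.h>
#include <stdlib.h>

/*
 * Sketch: order index entries on a bucket page by their 32-bit hash code.
 * In the "hash code plus key" variant, equal hash codes would additionally
 * compare the stored key before visiting the heap.
 */
typedef struct HashPageItemSketch
{
    uint32_t    hashcode;
    /* the original index key would follow here in the wider variant */
} HashPageItemSketch;

static int
hash_item_cmp(const void *a, const void *b)
{
    uint32_t    ha = ((const HashPageItemSketch *) a)->hashcode;
    uint32_t    hb = ((const HashPageItemSketch *) b)->hashcode;

    return (ha > hb) - (ha < hb);   /* -1, 0, or 1 without overflow */
}

/* usage: qsort(items, nitems, sizeof(HashPageItemSketch), hash_item_cmp); */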
Re: [HACKERS] Window functions patch v04 for the September commit fest
Heikki Linnakangas wrote: I'll review the parser/planner changes from the current patch. Looks pretty sane to me. A few issues: Is it always OK to share a window between two separate window function invocations, if they both happen to have an identical OVER clause? It seems OK for stable functions, but I'm not sure that's correct for expressions involving volatile functions. I wonder if the SQL spec has anything to say about that. As you noted in the comments, the planner could be a lot smarter about avoiding sorts. Currently you just always put a sort node below each Window, and another on top of them if there's an ORDER BY. There's obviously a lot of room for improvement there. This line:

    result_plan->targetlist = preprocess_window(tlist, result_plan);

in planner.c won't work if result_plan is one that can't do projection. A few screenfuls above that call, there's this:

    /*
     * If the top-level plan node is one that cannot do expression
     * evaluation, we must insert a Result node to project the
     * desired tlist.
     */
    if (!is_projection_capable_plan(result_plan))
    {
        result_plan = (Plan *) make_result(root, sub_tlist, NULL, result_plan);
    }
    else
    {
        /*
         * Otherwise, just replace the subplan's flat tlist with
         * the desired tlist.
         */
        result_plan->targetlist = sub_tlist;
    }

which is what you need to do with the preprocess_window call as well. I think this is what causes this:

postgres=# explain SELECT row_number() OVER (ORDER BY id*10) FROM (SELECT * FROM foo UNION ALL SELECT * FROM foo) sq;
ERROR:  bogus varattno for OUTER var: 2

And then there's these:

postgres=# explain SELECT * FROM foo WHERE row_number() OVER (ORDER BY id) < 10;
ERROR:  winaggref found in non-Window plan node

postgres=# explain UPDATE foo SET id = 10 RETURNING (ROW_NUMBER() OVER (ORDER BY random()));
ERROR:  winaggref found in non-Window plan node

postgres=# explain SELECT row_number() OVER (ORDER BY (1)) FROM foo;
ERROR:  ORDER/GROUP BY expression not found in targetlist

postgres=# explain SELECT row_number() OVER (ORDER BY length('foo')) FROM foo;
ERROR:  could not find pathkey item to sort

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need more reviewers!
On Thu, Sep 04, 2008 at 02:01:18PM -0400, Tom Lane wrote: Jonah H. Harris [EMAIL PROTECTED] writes: I'll push forward on reviewing and testing Xiao's hash index improvements for inclusion into core. Though, someone will still need to review my stuff. I think what the hash index patch really needs is some performance testing. I'm willing to take responsibility for the code being okay or not, but I haven't got any production-grade hardware to do realistic performance tests on. I'd like to see a few more scenarios tested than the one provided so far: in particular, wide vs narrow index keys and good vs bad key distributions. Tom, Do you mean good vs. bad key distributions with respect to hash value collisions? It strikes me that there's an intermediate posture that the patch doesn't provide: namely, store *both* hash code and original index key in each index entry. This would make the index larger rather than smaller, but you could still do the intra-page sort by hash code, which I think is likely a big win. In exchange for the bigger index you'd avoid fruitless trips to the heap for chance hashcode matches. So it seems far from clear to me that the hash-code-only solution is necessarily the best. Currently, since a trip to the heap is required to validate any tuple even if the exact value is contained in the index, it does not seem like it would be a win to store the value in both places. One of the big improvements is the space efficiency of the index obtained by just storing the hash value. I think that increasing the number of hash bits stored would provide more bang for the buck than storing the exact value. One easy way to provide this functionality without increasing the storage requirements is to take advantage of the knowledge of the bucket number to increase the number of bits; i.e. at 256 buckets, remove the first 8 bits of the hash and add 8 bits to provide a 40-bit hash in the same storage. At 64k buckets, you can now store a 48-bit hash in the same space, and at 16m buckets you can store a 56-bit hash in the original 32-bit space. So as the number of buckets increases, the number of hash bits increases, as does the specificity of the resulting hash, thereby reducing the need to visit the heap tuple store. Another idea takes advantage of smaller bucket sizes (I think of them as sub-buckets) to add an additional 8 bits of hash value at 32m or 64m buckets by looking up the hash in one of 2^n sub-buckets where n = 64 or 128 or even 256. This has the advantage of eliminating the binary lookup and insertion within the bucket page. Again, this will need to be tested. If anyone is willing to do comparative performance testing, I'll volunteer to make up two variant patches that do it both ways and are otherwise equivalent. I am in the same boat, in that I do not have database systems with enough hardware to perform realistic testing. (I wouldn't be too surprised if testing shows that each way could be a win depending on the situation. In that case we could think about how to provide a switch to let users pick the method. But for the moment let's just test two separate patches.) I think, though as yet untested, that storing both would be of value when we do not have to visit the heap for visibility information; otherwise, using the space to store more hash bits would be more effective. Cheers, Ken regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
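Kenneth's space-reuse idea can be made concrete: with 2^n buckets the bucket number already fixes n bits of the hash, so the 32-bit slot in each index entry is free to hold other bits, widening the effective hash without widening the entry. A sketch under that assumption (hypothetical code; note it presumes a hash function wider than the 32 bits today's datatype hash functions return, which is the objection Tom raises in his reply):

#include <stdint.h>

/*
 * Sketch only: assume a (hypothetical) 64-bit hash.  With 2^bucket_bits
 * buckets, the low bucket_bits bits select the bucket and need not be
 * stored, so the 32-bit slot in the index entry can hold the next 32 bits,
 * giving an effective (32 + bucket_bits)-bit hash.
 */
static uint32_t
stored_hash_bits(uint64_t full_hash, unsigned bucket_bits)
{
    /* the low bucket_bits bits are implied by which bucket the entry is in */
    return (uint32_t) (full_hash >> bucket_bits);
}

static uint64_t
effective_hash(uint32_t stored_bits, uint64_t bucket_number, unsigned bucket_bits)
{
    /* reconstruct all (32 + bucket_bits) significant bits for comparison */
    return ((uint64_t) stored_bits << bucket_bits) | bucket_number;
}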
Re: [HACKERS] Need more reviewers!
Kenneth Marshall [EMAIL PROTECTED] writes: On Thu, Sep 04, 2008 at 02:01:18PM -0400, Tom Lane wrote: I think what the hash index patch really needs is some performance testing. I'm willing to take responsibility for the code being okay or not, but I haven't got any production-grade hardware to do realistic performance tests on. I'd like to see a few more scenarios tested than the one provided so far: in particular, wide vs narrow index keys and good vs bad key distributions. Do you mean good vs. bad key distributions with respect to hash value collisions? Right, I'm just concerned about how badly it degrades with hash value collisions. Those will result in wasted trips to the heap if we omit the original key from the index. AFAICS that's the only downside of doing so; but we should have an idea of how bad it could get before committing to doing this. Currently, since a trip to the heap is required to validate any tuple even if the exact value is contained in the index, it does not seem like it would be a win to store the value in both places. The point is that currently you *don't* need a trip to the heap if the key doesn't match, even if it has the same hash code. I think that increasing the number of hash bits stored would provide more bang-for-the-buck than storing the exact value. Maybe, but that would require an extremely invasive patch that breaks existing user-defined datatypes. You can't just magically get more hash bits from someplace, you need datatype-specific hash functions that will provide more than 32 bits. There's going to have to be a LOT of evidence in support of the value of doing that before I'll buy in. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] GUC source file and line number]
On Thu, 4 Sep 2008, Kevin Grittner wrote: Personally, I would take the Min, Default, and Max to mean what Greg intends; it's the Current one that gives me pause. That's the output of current_setting(name), which shows what it is right now; no more, no less. See http://www.postgresql.org/docs/current/interactive/sql-show.html and http://www.postgresql.org/docs/current/interactive/functions-admin.html That shows the parameter as the admin/user set it, whereas the setting column in pg_settings displays it in terms of the internal units. Example session showing how that all fits together:

pg=# select current_setting('work_mem');
 current_setting
-----------------
 1GB
(1 row)

pg=# set work_mem='256MB';
SET
pg=# select current_setting('work_mem');
 current_setting
-----------------
 256MB
(1 row)

pg=# select setting,unit from pg_settings where name='work_mem';
 setting | unit
---------+------
 262144  | kB
(1 row)

Since the word current isn't actually in the patch anywhere, and only appears in that little sample usage snippet I provided, whether or not it's a good name for that doesn't impact the patch itself. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need more reviewers!
On Thu, 2008-09-04 at 14:01 -0400, Tom Lane wrote: If anyone is willing to do comparative performance testing, I'll volunteer to make up two variant patches that do it both ways and are otherwise equivalent. Why not do both, set via a reloption? We can then set the default to whichever wins in general testing. There will always be cases where one or the other is a winner, so having both will be useful. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need more reviewers!
On Thu, 2008-09-04 at 10:45 -0700, Josh Berkus wrote: We currently have 38 patches pending, and only nine people reviewing them. At this rate, the September commitfest will take three months. If you are a PostgreSQL hacker at all, or even want to be one, we need your help reviewing patches! There are several easy patches in the list, so I can assign them to beginners. Please volunteer now! Everybody is stuck in "I'm not good enough to do a full review." They're right (myself included), so that just means we're organising it wrongly. We can't expect to grow more supermen, but we probably can do more teamwork and delegation. I think this should be organised with different kinds of reviewer:

* submission review - is the patch in standard form, does it apply, does it have reasonable tests, docs etc.? Little or no knowledge of the patch required to do that. We have at least 50 people who can do that.
* usability review - read what the patch does. Do we want that? Do we already have it? Does it follow the SQL spec? Are there dangers? Have all the bases been covered? We have 100s of people who can do that.
* feature test - does the feature work as advertised?
* performance review - does the patch slow down simple tests? Does it speed things up like it says? Does it slow down other things? We have at least 100 people who can do that.
* coding review - does it follow standard code guidelines? Are there portability issues? Will it work on Windows/BSD etc.? Are there sufficient comments?
* code review - does it do what it says, correctly?
* architecture review - is everything done in a way that fits together coherently with other features/modules? Are there interdependencies that can cause problems?
* review review - did the reviewer cover all the things that kind of reviewer is supposed to do?

If we split up things like this, we'll get a better response. Why not go to -perform for performance testers, general for usability and feature testers? If we encourage different types of review, more people will be willing to step up to doing a small amount for the project. Lots of eyeballs, not one good pair of eyes. Also, why has the CommitFest page been hidden? What's the point of that? We probably need to offer some training and/or orientation for prospective reviewers, so people understand what is expected of them, how to do it and who to tell when they've done it. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Page layout footprint
Zdenek Kotala wrote: The original proposal (http://archives.postgresql.org/message-id/[EMAIL PROTECTED]) contains two parts. The first part is the implementation of a --footprint command-line switch, which shows you the page layout structures footprint. It is useful for development (mostly for in-place upgrade) and also for manual data recovery when you need to know the exact structures. I'm afraid I still fail to see the usefulness of this. gdb knows how to deal with structs, and for manual data recovery you need so much more than the page header structure. And if you're working at such a low level, it's not that hard to calculate the offsets within the struct manually. BTW, this makes me wonder if it would be possible to use the upgrade-in-place machinery to convert a data directory from one architecture to another? Just a thought... However, there is still the --footprint switch, which is useful and is the reason why I put it on the wiki for review and feedback. The switch could also be used in the buildfarm for collecting footprints from buildfarm members. If we needed more information about the architectures, we could just collect the output of pg_controldata. But I think the configure logs already contain all the useful information. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need more reviewers!
Simon Riggs [EMAIL PROTECTED] writes: On Thu, 2008-09-04 at 14:01 -0400, Tom Lane wrote: If anyone is willing to do comparative performance testing, I'll volunteer to make up two variant patches that do it both ways and are otherwise equivalent. Why not do both, set via a reloption? That is exactly the code I'm unwilling to write until we find out if it's needed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Prototype: In-place upgrade
The patch seems to be missing the new htup.c file. Zdenek Kotala wrote: The attached patch is a prototype of in-place upgrade as presented at PGCon this year. The main idea is to teach postgres to handle different versions of the page and tuple structures. 1) page - The patch contains a new page API, and all code accesses pages through this API. The functions check the page version and return the correct data to the caller. It is mostly complete now; only the ItemId flags need finishing. 2) tuple - The HeapTuple structure has been extended with a t_ver attribute which contains the page layout version, and direct access to HeapTupleHeader is forbidden. It is now possible only through the HeapTuple* functions (see htup.c). (HeapTupleHeader access still remains in several functions like heap_form_tuple.) This patch version still does not allow reading an old database, but it shows how it should work. The main disadvantage of this approach is the performance penalty. Please let me know your opinion about this approach. Future work: 1) teach WAL to process different tuple structure versions 2) tuple conversion to the new version, hooked into the executor (ExecStoreTuple) 3) multiversion MaxItemSize constant -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
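The accessor discipline described above comes down to every read of a tuple-header field going through a function that branches on the layout version carried in HeapTuple. A much-simplified sketch of that shape (all struct and field names here are hypothetical stand-ins, not taken from the patch):

#include "postgres.h"

/* Hypothetical minimal stand-ins for two on-disk header layouts. */
typedef struct TupleHeaderV3 { uint32 t_xmin; /* ... */ } TupleHeaderV3;
typedef struct TupleHeaderV4 { uint32 t_xmin; /* ... */ } TupleHeaderV4;

typedef struct VersionedTupleSketch
{
    int         t_ver;          /* page layout version the tuple was read from */
    void       *t_data;         /* points at the on-disk tuple header */
} VersionedTupleSketch;

static uint32
sketch_tuple_get_xmin(VersionedTupleSketch *tuple)
{
    switch (tuple->t_ver)
    {
        case 4:                 /* current layout */
            return ((TupleHeaderV4 *) tuple->t_data)->t_xmin;
        case 3:                 /* previous-release layout */
            return ((TupleHeaderV3 *) tuple->t_data)->t_xmin;
        default:
            elog(ERROR, "unrecognized page layout version %d", tuple->t_ver);
            return 0;           /* not reached; keeps the compiler quiet */
    }
}

Paying for a version test on every field access is presumably where the performance penalty mentioned in the patch description comes from, which is why the conversion-on-read step listed under future work matters.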
Re: [HACKERS] code coverage patch
Michelle Caisse wrote: I've attached a patch that allows the generation of code coverage statistics. To test it, apply the patch, then:

autoconf
./configure --enable-coverage
make
make check   (or execute any other application against the database to see the coverage of that app)
make coverage
make coverage_out

make clean does not work for me; it doesn't remove the .gcda and .gcno files. Apparently the problem is that $(enable_coverage) is not defined, so that part of common.mk is not called. Note: one thing to keep in mind is directories like src/port. There are some .gcda and .gcno files in there too, but even if common.mk is fixed, they will not be cleaned because src/port does not seem to use common.mk. Another thing that I'm unsure about is the coverage_out target. It does work, but is it running the coverage target each time it is invoked? Because if so, it's removing all of ./coverage and creating it again ... is this the desired behavior? This patch is missing an installation.sgml patch, at a minimum. I think it would be useful to mention that we support gcov, and the make targets we have, in some other part of the documentation. I can't readily find a good spot, but I think a new file to be inserted in the internals chapter would be appropriate. Two hunks no longer apply, but that's OK because they were working around a problem that no longer exists. Other than the minor gripes above, the patch looks OK to me. -- Alvaro Herrera   http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Verbosity of Function Return Type Checks
Volkan YAZICI wrote: Made callers pass the related error message as a string parameter, and appended the required details using errdetail(). Cool, thanks. I had a look and you had some of the expected vs. returned arguments reversed. This patch should be OK. Amazingly, none of the regression tests need changing, which is a bit worrisome. I wasn't able to run the tests in contrib, I don't know why, and I have to go out now. I'll commit this tomorrow. -- Alvaro Herrera  http://www.CommandPrompt.com/  PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Index: src/pl/plpgsql/src/pl_exec.c
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/pl/plpgsql/src/pl_exec.c,v
retrieving revision 1.219
diff -c -p -r1.219 pl_exec.c
*** src/pl/plpgsql/src/pl_exec.c	1 Sep 2008 22:30:33 -0000	1.219
--- src/pl/plpgsql/src/pl_exec.c	4 Sep 2008 22:11:46 -0000
*************** static Datum exec_simple_cast_value(Datu
*** 188,194 ****
  					 Oid reqtype, int32 reqtypmod, bool isnull);
  static void exec_init_tuple_store(PLpgSQL_execstate *estate);
! static bool compatible_tupdesc(TupleDesc td1, TupleDesc td2);
  static void exec_set_found(PLpgSQL_execstate *estate, bool state);
  static void plpgsql_create_econtext(PLpgSQL_execstate *estate);
  static void free_var(PLpgSQL_var *var);
--- 188,195 ----
  					 Oid reqtype, int32 reqtypmod, bool isnull);
  static void exec_init_tuple_store(PLpgSQL_execstate *estate);
! static void validate_tupdesc_compat(TupleDesc expected, TupleDesc returned,
! 									char *msg);
  static void exec_set_found(PLpgSQL_execstate *estate, bool state);
  static void plpgsql_create_econtext(PLpgSQL_execstate *estate);
  static void free_var(PLpgSQL_var *var);
*************** plpgsql_exec_function(PLpgSQL_function *
*** 384,394 ****
  		{
  			case TYPEFUNC_COMPOSITE:
  				/* got the expected result rowtype, now check it */
! 				if (estate.rettupdesc == NULL ||
! 					!compatible_tupdesc(estate.rettupdesc, tupdesc))
! 					ereport(ERROR,
! 							(errcode(ERRCODE_DATATYPE_MISMATCH),
! 							 errmsg("returned record type does not match expected record type")));
  				break;
  			case TYPEFUNC_RECORD:
--- 385,392 ----
  		{
  			case TYPEFUNC_COMPOSITE:
  				/* got the expected result rowtype, now check it */
! 				validate_tupdesc_compat(tupdesc, estate.rettupdesc,
! 										"returned record type does not match expected record type");
  				break;
  			case TYPEFUNC_RECORD:
*************** plpgsql_exec_trigger(PLpgSQL_function *f
*** 705,715 ****
  		rettup = NULL;
  	else
  	{
! 		if (!compatible_tupdesc(estate.rettupdesc,
! 								trigdata->tg_relation->rd_att))
! 			ereport(ERROR,
! 					(errcode(ERRCODE_DATATYPE_MISMATCH),
! 					 errmsg("returned tuple structure does not match table of trigger event")));
  		/* Copy tuple to upper executor memory */
  		rettup = SPI_copytuple((HeapTuple) DatumGetPointer(estate.retval));
  	}
--- 703,711 ----
  		rettup = NULL;
  	else
  	{
! 		validate_tupdesc_compat(trigdata->tg_relation->rd_att,
! 								estate.rettupdesc,
! 								"returned tuple structure does not match table of trigger event");
  		/* Copy tuple to upper executor memory */
  		rettup = SPI_copytuple((HeapTuple) DatumGetPointer(estate.retval));
  	}
*************** exec_stmt_return_next(PLpgSQL_execstate
*** 2199,2209 ****
  						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  						 errmsg("record \"%s\" is not assigned yet", rec->refname),
! 						 errdetail("The tuple structure of a not-yet-assigned record is indeterminate.")));
! 			if (!compatible_tupdesc(tupdesc, rec->tupdesc))
! 				ereport(ERROR,
! 						(errcode(ERRCODE_DATATYPE_MISMATCH),
! 						 errmsg("wrong record type supplied in RETURN NEXT")));
  			tuple = rec->tup;
  		}
  		break;
--- 2195,2204 ----
  						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  						 errmsg("record \"%s\" is not assigned yet", rec->refname),
! 						 errdetail("The tuple structure of a not-yet-assigned "
! 								   "record is indeterminate.")));
! 			validate_tupdesc_compat(tupdesc, rec->tupdesc,
! 									"wrong record type supplied in RETURN NEXT");
  			tuple = rec->tup;
  		}
  		break;
*************** exec_stmt_return_query(PLpgSQL_execstate
*** 2309,2318 ****
  										  stmt->params);
  	}
! 	if (!compatible_tupdesc(estate->rettupdesc, portal->tupDesc))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_DATATYPE_MISMATCH),
! 				 errmsg("structure of query does not match function result type")));
  	while (true)
  	{
--- 2304,2311 ----
  										  stmt->params);
  	}
! 	validate_tupdesc_compat(estate->rettupdesc, portal->tupDesc,
! 							"structure of query does not match function result type");
  	while (true)
  	{
*************** exec_simple_check_plan(PLpgSQL_expr *exp
*** 5145,5167 ****
  }
  /*
!  * Check two tupledescs have
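(For readers skimming the thread: here is a rough sketch of the shape such a consolidated check could take. It is emphatically not Volkan's patch, which is cut off above; the errdetail wording, the column-by-column loop, and the use of format_type_be() are assumptions made for illustration, and it presumes 8.3-era backend headers and TupleDesc layout.)

/*
 * Hypothetical sketch only -- not the patch under review.  Assumes the
 * usual backend headers (postgres.h, utils/builtins.h for format_type_be)
 * and the 8.3-era TupleDesc layout (tupdesc->attrs[i] is a Form_pg_attribute).
 */
static void
validate_tupdesc_compat(TupleDesc expected, TupleDesc returned, const char *msg)
{
	int			i;

	if (returned == NULL || expected->natts != returned->natts)
		ereport(ERROR,
				(errcode(ERRCODE_DATATYPE_MISMATCH),
				 /* "%s" keeps any stray % in msg from being taken as a format code */
				 errmsg("%s", msg),
				 errdetail("Number of returned columns (%d) does not match "
						   "expected column count (%d).",
						   returned ? returned->natts : 0,
						   expected->natts)));

	for (i = 0; i < expected->natts; i++)
		if (expected->attrs[i]->atttypid != returned->attrs[i]->atttypid)
			ereport(ERROR,
					(errcode(ERRCODE_DATATYPE_MISMATCH),
					 errmsg("%s", msg),
					 errdetail("Returned type %s does not match expected type %s in column %d.",
							   format_type_be(returned->attrs[i]->atttypid),
							   format_type_be(expected->attrs[i]->atttypid),
							   i + 1)));
}

A real version would also need to cope with dropped columns (atttypid = 0), a point raised later in this thread.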
Re: [HACKERS] code coverage patch
Thanks, I'll take a look at these issues. -- Michelle

Alvaro Herrera wrote: Michelle Caisse wrote: I've attached a patch that allows the generation of code coverage statistics. To test it, apply the patch, then:

  autoconf
  ./configure --enable-coverage
  make
  make check   (or execute any other application against the database to see the coverage of that app)
  make coverage
  make coverage_out

make clean does not work for me; it doesn't remove the .gcda and .gcno files. Apparently the problem is that $(enable_coverage) is not defined, so that part of common.mk is not called. Note: one thing to keep in mind is directories like src/port. There are some .gcda and .gcno files in there too, but even if common.mk is fixed, they will not be cleaned because src/port does not seem to use common.mk. Another thing that I'm unsure about is the coverage_out target. It does work, but is it running the coverage target each time it is invoked? Because if so, it's removing all of ./coverage and creating it again ... is this the desired behavior? This patch is missing an installation.sgml patch, at a minimum. I think it would be useful to mention that we support gcov, and the make targets we have, in some other part of the documentation. I can't readily find a good spot, but I think a new file inserted into the internals chapter would be appropriate. Two hunks no longer apply, but that's OK because they were working around a problem that no longer exists. Other than the minor gripes above, the patch looks OK to me. -- Michelle Caisse  Sun Microsystems  California, U.S.  http://sun.com/postgresql
Re: [HACKERS] Need more reviewers!
On Thu, Sep 4, 2008 at 12:01 PM, Tom Lane [EMAIL PROTECTED] wrote: I think what the hash index patch really needs is some performance testing. I'm willing to take responsibility for the code being okay or not, but I haven't got any production-grade hardware to do realistic performance tests on. I'd like to see a few more scenarios tested than the one provided so far: in particular, wide vs narrow index keys and good vs bad key distributions. If anyone is willing to do comparative performance testing, I'll volunteer to make up two variant patches that do it both ways and are otherwise equivalent.

I can happily throw some hardware at this. Although production-grade is in the eye of the beholder... regards, tom lane
Re: [HACKERS] Need more reviewers!
Alex Hunsaker [EMAIL PROTECTED] writes: I can happily throw some hardware at this. Although production-grade is in the eye of the beholder... I just posted a revised patch in the pgsql-patches thread. regards, tom lane
Re: [HACKERS] Verbosity of Function Return Type Checks
Alvaro Herrera [EMAIL PROTECTED] writes: I wasn't able to run the tests in contrib, I don't know why, and I have to go out now. I'll commit this tomorrow. This is not ready to go: you've lost the ability to localize most of the error message strings. Also, char *msg should be const char *msg if you're going to pass literal constants to it, and this gives me the willies even though the passed-in strings are supposedly all fixed: errmsg(msg). Use errmsg("%s", msg) instead, to be safe. Actually, the entire concept of varying the main message to suit the context exactly, while the detail messages are *not* changing, seems pretty bogus... Another problem with it is it's likely going to fail completely on dropped columns (which will have atttypid = 0). regards, tom lane
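(To spell out the hazard being pointed at, here is a hypothetical illustration, not code from the patch; printf() follows the same varargs/format rules as errmsg():)

/* Hypothetical illustration of the format-string hazard. */
#include <stdio.h>

int
main(void)
{
	const char *msg = "100% of the expected columns are missing";

	/* printf(msg) would treat the stray % as a conversion specifier and
	 * read a missing argument -- undefined behaviour, exactly like
	 * errmsg(msg) in the backend. */
	printf("%s\n", msg);	/* safe: msg is emitted verbatim as data */
	return 0;
}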
Re: [HACKERS] pg_dump roles support
Tom Lane wrote: Some other review nitpicking: Thank you for your review. I really need all suggestions, since I have never posted a patch to the community before. The next patch will emit the SET ROLE command in the generated dump, as you and Stephen said. This will fit my workflow too, since I mostly need to restore using the same role as the dump. Regards, Laszlo Benedek
Re: [HACKERS] [ADMIN] SSL problems
After I disable the SSL option in postgresql.conf the server starts successfully. Please advise.

Jan-Peter Seifert wrote: Hello Andriy, the reply-to settings are a bit uncomfortable here. Your mail went only to me, but I'm not part of the developer or support team. It's strange that pg_ctl doesn't say anything else. Is there any system sniffer on FreeBSD like Process Monitor on Windows? I can only say that the docs worked for me (removed the password as described) on Ubuntu and Windows. I got complaints because of the rights on the certificates first. Does the server really start if SSL is deactivated in postgresql.conf again? Good luck, Peter

Yes, of course I compiled with OpenSSL support (the FreeBSD port has this option enabled by default). And I have all certificates with a proper CA signature; the rest of the applications (Postfix, Apache, etc.) work with these certificates very well. And to make sure, I ran the 'pg_config' command:

$ pg_config
BINDIR = /usr/local/bin
DOCDIR = /usr/local/share/doc/postgresql
INCLUDEDIR = /usr/local/include
PKGINCLUDEDIR = /usr/local/include/postgresql
INCLUDEDIR-SERVER = /usr/local/include/postgresql/server
LIBDIR = /usr/local/lib
PKGLIBDIR = /usr/local/lib/postgresql
LOCALEDIR = /usr/local/share/locale
MANDIR = /usr/local/man
SHAREDIR = /usr/local/share/postgresql
SYSCONFDIR = /usr/local/etc/postgresql
PGXS = /usr/local/lib/postgresql/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--with-libraries=/usr/local/lib' '--with-includes=/usr/local/include' '--enable-thread-safety' '--with-docdir=/usr/local/share/doc/postgresql' '--with-openssl' '--with-system-tzdata=/usr/share/zoneinfo' '--enable-integer-datetimes' '--enable-nls' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/info/' '--build=amd64-portbld-freebsd7.0' 'CC=cc' 'CFLAGS=-O2 -fno-strict-aliasing -pipe ' 'LDFLAGS= -pthread -rpath=/usr/local/lib' 'build_alias=amd64-portbld-freebsd7.0'
CC = cc
CPPFLAGS = -I/usr/local/include
CFLAGS = -O2 -fno-strict-aliasing -pipe -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
CFLAGS_SL = -fPIC -DPIC
LDFLAGS = -pthread -rpath=/usr/local/lib -L/usr/local/lib -Wl,-R'/usr/local/lib'
LDFLAGS_SL =
LIBS = -lpgport -lintl -lssl -lcrypto -lz -lreadline -lcrypt -lm
VERSION = PostgreSQL 8.3.3

It must be something else. Andriy

[EMAIL PROTECTED] wrote: Hi, Date: Wed, 03 Sep 2008 08:43:29 -0400 From: Andriy Bakay [EMAIL PROTECTED] To: [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: [ADMIN] SSL problems Hi Team, I have problems setting up SSL for the PostgreSQL server. I did all the steps described in the documentation (17.8. Secure TCP/IP Connections with SSL), but when I try to start the PostgreSQL server, pg_ctl gave me: could not start server. And there is nothing in the logs (I enabled all of them). I googled around but did not find much. My spec: FreeBSD 7.0-RELEASE-p3 amd64, PostgreSQL 8.3.3 (installed from ports): WITH_NLS=true WITHOUT_PAM=true WITHOUT_LDAP=true WITHOUT_MIT_KRB5=true WITHOUT_HEIMDAL_KRB5=true WITHOUT_OPTIMIZED_CFLAGS=true WITH_XML=true WITHOUT_TZDATA=true WITHOUT_DEBUG=true WITH_ICU=true WITH_INTDATE=true

Obviously configure hasn't been run with the option --with-openssl before compiling the binaries. With the PostgreSQL command pg_config you get the configure options that were used for making the binaries, so you can make sure. It seems that you must recompile from source. Are you sure you have OpenSSL itself installed on your system?
Maybe you have to generate a certificate as well. It has been a while since I last installed SSL support successfully on Windows and Linux. Peter
Re: [HACKERS] Need more reviewers!
On Fri, Sep 5, 2008 at 6:54 AM, Simon Riggs [EMAIL PROTECTED] wrote: On Thu, 2008-09-04 at 10:45 -0700, Josh Berkus wrote: Please volunteer now! Everybody is stuck in "I'm not good enough to do a full review." They're right (myself included), so that just means we're organising it wrongly. We can't expect to grow more supermen, but we probably can do more teamwork and delegation.

As a first-time reviewer, I agree with Simon's comments, and I'd like to make the point that there's currently no written policy for how to review a patch. I let Josh know that I was interested in joining this commitfest as a round-robin reviewer, and he's assigned me a patch. Okay. What am I supposed to do now? I can certainly download the patch, test it, review the code, and write my thoughts to the list. I can then add a review link to the wiki page. Assuming I think the patch is acceptable, what then? Do I hand it off to somebody else for a full review/commit? How do I do that? Etc. At the moment, for the review virgin, "please volunteer now" translates roughly as "please elect to join an opaque and undocumented process which has until now been handled entirely by committers". That might have something to do with the low turnout. We have a (really useful) wiki page called "Submitting a Patch". I think we need one called "Reviewing a Patch". That way, instead of just an appeal to the masses to volunteer for $NEBULOUS_TASK, we can say something like "Please volunteer to review patches. Doing an initial patch review is easy; please see our guide [link] to learn more." Cheers, BJ
[HACKERS] [Review] Tests citext casts by David Wheeler.
Hello all, Here is my review of the "Test citext casts" patch written by David Wheeler: http://archives.postgresql.org/message-id/[EMAIL PROTECTED]

1. The patch applies cleanly to the latest GIT repository.

2. The citext type installs, uninstalls, and re-installs cleanly.

3. The coding style is mostly consistent with the existing code. The only coding style difference I noticed was introduced in this patch. In the citext.sql.in file the following functions are created using the non-dollar-quoting syntax:
* regexp_matches
* regexp_replace
* regexp_split_to_array
* regexp_split_to_table
* strpos
In the citext.sql.in file the following functions are created using the dollar-quoting syntax:
* replace
* split_part
* translate
I do not have a strong preference either way, and I do not even care if they are consistent. I am interested to see if there was a reason for using both syntaxes for these functions.

4. The regression tests pass when PostgreSQL is built with --with-libxml. They fail when PostgreSQL is built without --with-libxml. Since the default PostgreSQL configuration does not include --with-libxml, and libxml support is not detected at run time even when the libxml2 libraries are present on the system, I would recommend adding an additional expected-output file (citext_1.out) that covers the case where PostgreSQL is compiled without --with-libxml. As an experiment, I was able to add citext_1.out, and the regression tests then passed both with and without the --with-libxml option.

Review of the additional regression tests shows they provide coverage of the new casts and functions added with this patch. Overall I think the patch looks good. After reviewing the patch, I played with citext for an hour or so and I did not encounter any bugs or other surprises. Thanks, - Ryan
Re: [HACKERS] [Review] Tests citext casts by David Wheeler.
On Sep 4, 2008, at 21:40, Ryan Bradetich wrote: Overall I think the patch looks good. After reviewing the patch, I played with citext for an hour or so and I did not encounter any bugs or other surprises. Thanks for the review, Ryan! Best, David
[HACKERS] [Review] pgbench duration option
On Tue, Aug 19, 2008 at 12:24 PM, ITAGAKI Takahiro [EMAIL PROTECTED] wrote: Ok, I rewrote the patch to use SIGALRM instead of gettimeofday.

Hi Itagaki-san, Josh assigned your patch to me for an initial review. Here's what I have so far. The patch applies cleanly on the latest git HEAD, and compiles/installs/runs nicely with a fresh initdb on amd64 gentoo. I didn't notice any aberrations in the coding style. The -T option seems to work as advertised, and I wasn't able to detect any performance degradation (or a significant variation of any kind) using the -T option versus the -t option. Unfortunately, I don't have access to a Windows build environment, so I wasn't able to investigate the portability concerns raised in the email thread. I do have a few minor suggestions:

* The error message given for -T <= 0 is "invalid number of duration(-T): %d". The grammar isn't quite right there. I would go with "invalid run duration (-T): %d", or perhaps just "invalid duration (-T): %d".

* If the -T and -t options are supposed to be mutually incompatible, then there should be an error if the user tries to specify both options. Currently, if I specify both options, the -t option is ignored in favour of -T. An error along the lines of "Specify either a number of transactions (-t) or a duration (-T), not both." would be nice.

* It seems like the code to stop pgbench when the timer has expired has some unnecessary duplication. Currently it reads like this: if (duration > 0) { if (timer_exceeded) { stop } } else if (st->cnt >= nxacts) { stop } Wouldn't this be better written as: if ((duration > 0 && timer_exceeded) || st->cnt >= nxacts) { stop } (A standalone sketch of this approach follows after this message.)

* The documentation should mention the new -T option in the following paragraph: "In the first place, never believe any test that runs for only a few seconds. Increase the -t setting enough to make the run last at least a few minutes, so as to average out noise." Perhaps: "In the first place, never believe any test that runs for only a few seconds. Use the -t or -T option to make the run last at least a few minutes, so as to average out noise."

That's it for my initial review. I hope my comments are helpful. Cheers, BJ
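(For anyone who hasn't read the patch: here is a hedged, self-contained sketch of the SIGALRM approach being reviewed, folding in the consolidated stop test suggested above. It is not pgbench code; the names duration, nxacts and timer_exceeded mirror this review but are assumptions about the patch, not quotations from it.)

/*
 * Minimal sketch of the SIGALRM technique: a signal handler sets a flag,
 * and the work loop checks the flag instead of calling gettimeofday()
 * on every iteration.
 */
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t timer_exceeded = 0;

static void
handle_sig_alarm(int sig)
{
	(void) sig;
	timer_exceeded = 1;
}

int
main(void)
{
	int		duration = 2;		/* like -T: run for two seconds */
	long	nxacts = 1000000;	/* like -t: transaction-count limit */
	long	cnt = 0;

	signal(SIGALRM, handle_sig_alarm);
	if (duration > 0)
		alarm(duration);

	for (;;)
	{
		usleep(1000);			/* stand-in for one transaction's worth of work */
		cnt++;

		/* the consolidated stop test suggested in the review */
		if ((duration > 0 && timer_exceeded) || cnt >= nxacts)
			break;
	}

	printf("completed %ld iterations\n", cnt);
	return 0;
}

Built with any C compiler, this exits after roughly two seconds rather than after the transaction-count limit, which is the behaviour the -T option is meant to provide.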
Re: [HACKERS] [Review] pgbench duration option
Brendan Jurd [EMAIL PROTECTED] wrote: Josh assigned your patch to me for an initial review. Here's what I have so far.

Thank you for reviewing!

The -T option seems to work as advertised, and I wasn't able to detect any performance degradation (or a significant variation of any kind) using the -T option versus the -t option.

Good news.

Unfortunately, I don't have access to a Windows build environment, so I wasn't able to investigate the portability concerns raised in the email thread.

I found I have a mistake in the usage of the BOOL and BOOLEAN types. I'll fix it in the next patch.

I do have a few minor suggestions: * The error message given for -T <= 0 is "invalid number of duration(-T): %d". The grammar isn't quite right there. I would go with "invalid run duration (-T): %d", or perhaps just "invalid duration (-T): %d".

"invalid duration (-T): %d" looks good to me.

* If the -T and -t options are supposed to be mutually incompatible, then there should be an error if the user tries to specify both options. Currently, if I specify both options, the -t option is ignored in favour of -T. An error along the lines of "Specify either a number of transactions (-t) or a duration (-T), not both." would be nice.

Ok, I'll add the error protection.

* It seems like the code to stop pgbench when the timer has expired has some unnecessary duplication.

Ah, I forgot to clean up the code. I'll fix it. (There was other logic here in the first patch, but it is not needed in this version.)

* The documentation should mention the new -T option in the following paragraph: "In the first place, never believe any test that runs for only a few seconds. Use the -t or -T option to make the run last at least a few minutes, so as to average out noise."

I'll do it. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center