Re: [HACKERS] psql variables
Tom Lane [EMAIL PROTECTED] writes: [ blinks... ] This is historical revisionism. Psql variables were invented to provide user-defined variables; it is the predefined ones that are a wart added to the mechanism, not vice versa. The historical origins of the feature are no excuse for its deficiencies. Again, you're letting the tail wag the dog. If we did that then any addition of a new built-in variable would risk breaking existing user scripts that happened to use that name as an ordinary variable. A name conflict with a newly-added variable is very likely to cause problems in any case, if the new variable does anything at all. For example, if your pre-7.4 script used AUTOCOMMIT to mean something other than what 7.4 thinks it means, your script is probably broken. On second thought, there's another alternative. Rather than improving \set, we could invent a new mechanism for setting psql-internal variables, and leave the \set stuff to user-defined variables. That way we can define whatever semantics we'd like for the new command without hindering backward compatibility. We'd need to ensure that using \set to modify existing psql variable names still works, but that shouldn't be too difficult. One downside is that we'd now have *three* different sets of variables in psql alone (\set, \pset, and \sys_set or whatever it might be named). We might improve that by amalgamating \pset into \sys_set, because AFAICS there is no reason for the former to exist. Comments? -Neil ---(end of broadcast)--- TIP 8: explain analyze is your friend
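For readers following along, the single namespace under discussion looks roughly like this in a psql session (variable name made up for illustration):

```
-- a user-defined variable, created implicitly by \set
\set myvar hello
\echo :myvar
-- the very same command also drives built-in psql behavior
\set AUTOCOMMIT off
```

The proposal above would split the second kind of assignment out into its own command.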
Re: [HACKERS] session persistent data for plperl
Adnrew, The attached tiny patch (not intended for application yet) provides a space for plperl functions to create and share session persistent data, which I should think would increase the utility of plperl. You want to really make PL/Perl more useful? Add an SPI interface, and work out the kinks in error-trapping via eval{}. This would be far more useful than session variables. Just a thought. -- Josh Berkus Aglio Database Solutions San Francisco
[HACKERS] Timestamps
Hi all, I am not sure if this should be sent to this mailing list. If I am wrong, could someone please direct me to the correct one so I can subscribe there. I wanted to ask a simple question. Say I have a table with a timestamp field. What is the best way to get all the records that were created in, say, the two hours before the query? One of the options would be to generate the timestamp in the correct format and then send a query in the format SELECT * from table where create_time > generated_timestamp. Is there a better way than this? Any help would be greatly appreciated, Regards, Slavisa
Re: [HACKERS] RFC: Security documentation
Alex, As such, I would like to see some documentation about securing the database at a data and application level. It would be nice to have some general guidelines, as well as being able to cite documentation when setting up a security policy for a database application. I'd be happy to participate in discussing security strategies for PostgreSQL databases; I do a bit of this for my clients though, not, I think, on the level of need you face. A lot can be done especially with 7.4 by manipulating permissions, obfuscating database objects through views, functions, and rules, and otherwise locking down database objects. For example, if the web server user's only access to a table is via a set-returning search function, and that user has no other permissions, a SQL injection attack isn't going to obtain anything other than an error. The problem with this approach, of course, is that large application developers generally like to make the database fairly passive and put all business security logic in the middleware. I do think it would be useful for them to realize that they are sacrificing a significant portion of their data security by doing so. On a machine/network level, all I really do is lock down port access to the database server, and make sure that the db server has no ports that point to unsecured networks (as well as the usual user/database/password restrictions). My general perspective is that if an attacker can gain unrestricted port access to the database, a break-in is only a matter of time -- if through nothing else than password-guessing attacks. I'm not sure such a paper is appropriate for the main docs. But it is definitely appropriate for TechDocs. -- Josh Berkus Aglio Database Solutions San Francisco
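The lockdown Josh describes can be sketched roughly like this; the table, role, and function names are all made up for illustration, and the exact CREATE FUNCTION options depend on your server version:

```sql
-- the web role loses all direct access to the table
REVOKE ALL ON customers FROM webuser;

-- a set-returning search function, running with its owner's rights
CREATE FUNCTION search_customers(text) RETURNS SETOF customers AS '
    SELECT * FROM customers WHERE name ~* $1;
' LANGUAGE sql SECURITY DEFINER;

-- EXECUTE on this one function is webuser's entire surface area
GRANT EXECUTE ON FUNCTION search_customers(text) TO webuser;
```

An injected query running as webuser then fails with a permissions error rather than reading the table directly.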
Re: [HACKERS] Timestamps
I wanted to ask a simple question. Say I have a table with a timestamp field. What is the best way to get all the records that were created in, say, the two hours before the query. One of the options would be to generate the timestamp in the correct format and then send a query in the format SELECT * from table where create_time > generated_timestamp. Is there a better way than this? Sure is: SELECT * from table where create_time > (CURRENT_TIMESTAMP - INTERVAL '2 hours'); Chris
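To make that query fast on a large table, an index on the timestamp column helps; a small sketch, with a hypothetical table name:

```sql
-- hypothetical table; the index lets the planner avoid a full scan
CREATE INDEX orders_create_time_idx ON orders (create_time);

-- rows created within the two hours before the query runs
SELECT *
  FROM orders
 WHERE create_time > (CURRENT_TIMESTAMP - INTERVAL '2 hours');
```

Flip the comparison to < if you instead want the rows older than two hours.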
Re: [HACKERS] psql tab completion USERSET vars
Tom Lane [EMAIL PROTECTED] writes: I'm not by any means wedded to the USERSET and possibly SUSET policy, but I would like to stop somewhere short of "include everything". Any thoughts? Perhaps we could have two sets of variables: all the GUC vars (those that can be displayed via SHOW), and the subset of those that can be set by the user. We could use the first set for tab completion on SHOW and the second for tab completion on SET. -Neil
Re: [HACKERS] RFC: Very large scale postgres support
Alex, I find myself wondering what other people are doing with postgres that this doesn't seem to have come up. When one searches for postgres clustering on google, they will find lots of HA products. However, nobody seems to be attempting to create very high throughput clusters. Have you checked out Clusgres from Linux Labs? One of my clients will be testing this application soon. While we haven't run into transaction throughput limits, a couple of my clients seem to be running into x86's limits on very large queries, particularly on vendor hardware. -- Josh Berkus Aglio Database Solutions San Francisco
Re: [HACKERS] RFC: Very large scale postgres support
That's what I said, and what I meant. Ten billion transactions a day equates to 115,740 transactions per second. Have you tried to look at the scientific community? CERN has setups that produce such large amounts of data - try searching google for http://www.google.com/search?q=cern+event+database+postgresql or even http://www.google.com/search?q=cern+event+database You might find some inspiration there! Bye, Chris.
Re: [HACKERS] Advice regarding configuration parameters
Tom Lane wrote: Given all the work Peter put into GUC (for very good reasons), I was a tad astonished to read him proposing to develop a non-GUC mechanism for configuring PLs. I for one was a tad astonished to read that there is already support for adding variables at run-time, given that we previously rejected that notion. But the namespace idea for variables added by external modules sounds interesting.
Re: [HACKERS] psql variables
Neil Conway wrote:
> Recently, I was surprised to learn that psql variables are case sensitive.

Like shell variables.

> Furthermore, there is no error when one attempts to '\set' a non-existent variable

Well, how are you going to set a new variable if not this way?

> One possible justification for this behavior is that it allows a simple naming convention to distinguish user-defined variables from psql-internal variables. Is that the case?

Yes, upper-case names are reserved. This is documented.

> While we're on the subject, there are some other design choices in this area that seem a bit unfortunate. For example, we don't error out on references to undefined variables

Like shell variables.

> Also, the user can go ahead and overwrite the value of built-in variables like HOST and PORT

Like shell variables.
Re: [HACKERS] session persistent data for plperl
Josh Berkus wrote: Adnrew, Jsoh :-) You want to really make PL/Perl more useful? Add an SPI interface, and work out the kinks in error-trapping via eval{}. This would be far more useful than session variables. Just a thought. I don't intend to stop there. I admit that the gain in value from my patch is limited, but I think it is greater than 0, and for a 2-line change, a cheap win. When I have bedded down some other things I am working on (e.g. logging enhancements, dollar quoting) I intend to look further at some of the Perl issues, both on the server side and the client side. As a longtime and fond Perl user, I don't like seeing Perl as the poor cousin PL in our community. cheers andrew
Re: [HACKERS] Transaction aborts on syntax error.
Tom Lane [EMAIL PROTECTED] writes: What it comes down to is that a lot of code in the backend assumes that transaction abort can be relied on to do any post-elog cleanup needed, such as releasing locks or reclaiming leaked memory. I don't think we can afford to give up that assumption; the costs in code complexity and instability would be horrific. What we have to do is generalize the abort cleanup code so it can handle partial rollbacks as well as complete ones. Thus nested transactions is really a shorthand for this problem of post-error cleanup. So you picture the backend automatically introducing a mini-nested-transaction for every request and automatically rolling that back on any error. So the application or user wouldn't have to do anything to continue processing ignoring the error? -- greg
Re: [HACKERS] RFC: Very large scale postgres support
Alex, I agree that this is something that is worth spending time on. This resembles the Oracle RAC (Real Application Cluster). While other people may feel that the amount of data is unreasonable I have a similar problem that will only be solved using such a solution. In regards to how your database is designed? Who cares? This is an RFC for a general discussion on how to design this level of functionality into Postgres. Ultimately any solution would work without regard to the insert, updates, or deletes being executed. Alex, I think as a first step we should start coming up with a feature list of what would be necessary to support this level of functionality. From that point we could then identify efforts that are currently ongoing on Postgres development that we could help out on as well as those items that would need to be handled directly. I am very interested in going forth with this discussion and believe that I would be able to have the company I work for put forward resources (i.e. people or money) on developing the solution if we can come up with a workable plan. Josh, thanks for the heads up on Clusgres, I will take a look and see how that fits. Thanks, Keith -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex J. Avriette Sent: Saturday, February 07, 2004 12:29 PM To: [EMAIL PROTECTED] Subject: [HACKERS] RFC: Very large scale postgres support Recently I was tasked with creating a distribution system for postgres nodes here at work. This would allow us to simply bring up a new box, push postgres to it, and have a new database. At the same time, we have started to approach the limits of what we can do with postgres on one machine. Our platform presently is the HP DL380. It is a reasonably fast machine, but in order to eke more performance out of postgres, we are going to have to upgrade the hardware substantially. 
So the subject came up, wouldn't it be nice if, with replication and proxies, we could create postgres clusters? When we need more throughput, to just put a new box in the cluster, dist a postgres instance to it, and tell the proxy about it. This is a very attractive idea for us, from a scalability standpoint. It means that we don't have to buy $300,000 servers when we max out our 2- or 4-cpu machines (in the past, I would have suggested a Sun V880 for this database, but we are using Linux on x86). We are left with one last option, and that is re-engineering our application to distribute load across several instances of postgres which are operating without any real knowledge of each other. I worry, though, that as our needs increase further, these application redesigns will become asymptotic. I find myself wondering what other people are doing with postgres that this doesn't seem to have come up. When one searches for postgres clustering on google, they will find lots of HA products. However, nobody seems to be attempting to create very high throughput clusters. I feel that it would be a very good thing if some thinking on this subject was done. In the future, people will hopefully begin using postgres for more intense applications. We are looking at perhaps many tens of billions of transactions per day within the next year or two. To simply buy a bigger box each time we outgrow the one we're on is not effective nor efficient. I simply don't believe we're the only ones pushing postgres this hard. I understand there are many applications out there trying to achieve replication. Some of them seem fairly promising. However, it seems to me that if we want to see a true clustered database environment, there would have to be actual native support in the postmaster (inter postmaster communication if you will) for replication and cross-instance locking. This is obviously a complicated problem, and probably not very many of us are doing anything near as large-scale as this. 
However, I am sure most of us can see the benefit of being able to provide support for these sorts of applications. I've just submitted this RFC in the hopes that we can discuss both the best way to support very large scale databases, as well as how to handle them presently. Thanks again for your time. alex -- [EMAIL PROTECTED] Alex J. Avriette, Solaris Systems Masseur I ... remain against the death penalty because I feel that eternal boredom with no hope of parole is a much worse punishment than just ending it all mercifully with that quiet needle. - Rachel Mills, NC Libertarian Gubernatorial Candidate
Re: [HACKERS] Transaction aborts on syntax error.
Greg Stark [EMAIL PROTECTED] writes: So you picture the backend automatically introducing a mini-nested-transaction for every request and automatically rolling that back on any error. So the application or user wouldn't have to do anything to continue processing ignoring the error? You're assuming a bunch of facts not in evidence about how we choose to present this functionality to clients, including a rather dubious assumption that we'd choose to break backward compatibility. My guess is that there will be some way to get the above behavior (possibly implemented by client-library code rather than the backend), but that it won't be the default. regards, tom lane
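The per-statement behavior Greg describes could be sketched with SQL-standard savepoints, which PostgreSQL did not yet support when this was written (syntax per SQL99; table name made up):

```sql
BEGIN;
SAVEPOINT stmt;                       -- issued by the client library before each statement
INSERT INTO t VALUES (1);
RELEASE SAVEPOINT stmt;               -- statement succeeded, drop the savepoint
SAVEPOINT stmt;
INSERT INTO t VALUES ('not-an-int');  -- this statement fails
ROLLBACK TO SAVEPOINT stmt;           -- only the failed statement is undone
COMMIT;                               -- the first INSERT survives
```

Whether this happens implicitly in the backend, in libpq, or not at all is exactly what the thread is debating.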
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
Bruce Momjian wrote: Jan Wieck wrote: Tom Lane wrote: Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes: So Imho the target should be to have not much IO open for the checkpoint, so the fsync is fast enough, even if serial. The best we can do is push out dirty pages with write() via the bgwriter and hope that the kernel will see fit to write them before checkpoint time arrives. I am not sure if that hope has basis in fact or if it's just wishful thinking. Most likely, if it does have basis in fact it's because there is a standard syncer daemon forcing a sync() every thirty seconds. Looking at the response time charts I did for showing how vacuum delay is doing, it seems at least on Linux there is hope that that is the case. Those charts have just a regular 5 minute checkpoint with enough checkpoint segments for that, and no other sync effort done at all. The system has a hard time handling a larger-scale test DB, so it is definitely well saturated with I/O. The charts are here: http://developer.postgresql.org/~wieck/vacuum_cost/ That means that instead of an I/O storm every checkpoint interval, we get a smaller I/O storm every 30 seconds. Not sure this is a big improvement. Jan already found out that issuing very frequent sync()s isn't a win. In none of those charts can I see any checkpoint-caused I/O storm any more. Charts I'm currently doing for 7.4.1 show extremely clear spikes at checkpoints. If someone is interested in those as well I will put them up. So, Jan, are you basically saying that the background writer has solved the checkpoint I/O flood problem, and we just need to deal with changing sync to multiple fsync's at checkpoint? ISTM that the background writer at least has the ability to lower the impact of a checkpoint significantly enough that one might not care about it any more. Has the ability means, it needs to be adjusted to the actual DB usage. 
The charts I produced were not done with the default settings, but rather after making the bgwriter a bit more aggressive against dirty pages. The whole sync() vs. fsync() discussion is in my opinion nonsense at this point. Without the ability to limit the amount of files to a reasonable number, by employing tablespaces in the form of larger container files, the risk of forcing excessive head movement is simply too high. Jan -- It's easier to get forgiveness for being wrong than for being right. Let's break this rule - forgive me. [EMAIL PROTECTED]
Re: [HACKERS] RFC: Security documentation
On Sun, Feb 08, 2004 at 09:34:15PM -0500, Tom Lane wrote: Is this nothing? http://www.postgresql.org/docs/7.4/static/libpq-exec.html#LIBPQ-EXEC-ESCAPE-STRING I don't think the docs are nearly as bereft of security-related items as you claim. They may be scattered and poorly indexed, but they're there. Tom, I think this is largely a semantic issue. If documentation exists, but is difficult to find, or stored in such a way as to not be quickly available to somebody looking for it, it isn't useful. While not nothing as such, it doesn't count for much. I've liked what I've heard so far in this thread. Is there a consensus that some documentation could be added regarding security? If we can agree on that, I would be happy to start doing some collating of data on the subject. Could it go in the distributed documentation? I know there was some debate as to whether it belonged in the docs themselves, or in techdocs. Personally, I feel that distributing it in the main documentation would be preferable. However, I don't have any particular allegiance to that method; I mostly look for answers to questions via google first. If the docs were included on techdocs, google would find them soon enough. I suppose, also, anyone who was interested in securing their database would look a little further than the included documentation. Opinions? Alex -- [EMAIL PROTECTED] Alex J. Avriette, Shepherd of wayward Database Administrators We are paying through the nose to be ignorant. - Larry Ellison
Re: [HACKERS] RFC: Very large scale postgres support
Keith Bottner wrote: Alex, I agree that this is something that is worth spending time on. This resembles the Oracle RAC (Real Application Cluster). While other people may feel that the amount of data is unreasonable I have a similar problem that will only be solved using such a solution. In regards to how your database is designed? Who cares? This is an RFC for a general discussion on how to design this level of functionality into Postgres. IMHO a general discussion isn't too helpful, you might be discussing stuff that's never needed for PostgreSQL. Different database systems give different solutions to the same problem, as you might see from e.g. table partition discussions, which were initiated by Oracle-originating people. There still might be weaknesses in pgsql, but to identify them, *real* issues need to be discussed. This is necessary to avoid major hardware/software dbms efforts that might well be replaced by organizational/app level tools. Regards, Andreas
Re: [HACKERS] RFC: Very large scale postgres support
I always enjoy how everyone wants to talk about using different solutions prior to understanding the complete problem. I would say that a *real* issue is any perceived issue whether a current solution exists or not. If current solutions are applicable and would work then great, we have all gained; however, if in fact it is necessary to add the additional functionality to Postgres then so be it. But none of this can be decided until the complete problem and hence the requirements are understood. My impression of the Postgres project has always been that of a high end database system that is endeavoring to become a component of critical enterprise systems. If this is not true or as a group we are going to keep placing these scalability issues aside then this will never be achieved and those of us who want Postgres to play a more important role for our corporate systems will have no choice but to go somewhere else. I understand your position Andreas and respect your opinion; maybe what I have identified as requirements is what you are specifying as *real* issues. I hope so, because I too would like to avoid unnecessary dbms efforts. But from what I understand of Alex's problem and more specifically mine, adding another layer at the organizational/app level will not provide the level of functionality that is required. Regards, Keith -Original Message- From: Andreas Pflug [mailto:[EMAIL PROTECTED] Sent: Monday, February 09, 2004 9:54 AM To: Keith Bottner Cc: 'Alex J. Avriette'; [EMAIL PROTECTED] Subject: Re: [HACKERS] RFC: Very large scale postgres support Keith Bottner wrote: Alex, I agree that this is something that is worth spending time on. This resembles the Oracle RAC (Real Application Cluster). While other people may feel that the amount of data is unreasonable I have a similar problem that will only be solved using such a solution. In regards to how your database is designed? Who cares? 
This is an RFC for a general discussion on how to design this level of functionality into Postgres. IMHO a general discussion isn't too helpful, you might be discussing stuff that's never needed for PostgreSQL. Different database systems give different solutions to the same problem, as you might see from e.g. table partition discussions, which were initiated by Oracle-originating people. There still might be weaknesses in pgsql, but to identify them, *real* issues need to be discussed. This is necessary to avoid major hardware/software dbms efforts that might well be replaced by organizational/app level tools. Regards, Andreas
Re: [HACKERS] RFC: Very large scale postgres support
Keith Bottner wrote: I understand your position Andreas and respect your opinion; maybe what I have identified as requirements is what you are specifying as *real* issues. I hope so, because I too would like to avoid unnecessary dbms efforts. You got me very right. I didn't mean to declare high volume databasing a minor issue, I meant real world requirements. Regards, Andreas
Re: [HACKERS] Proposed Query Planner TODO items
On 6 Feb, To: [EMAIL PROTECTED] wrote: On 5 Jan, Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: 2) DEVELOP BETTER PLANS FOR OR GROUP QUERIES Summary: Currently, queries with complex or group criteria get devolved by the planner into canonical and-or filters resulting in very poor execution on large data sets. We should find better ways of dealing with these queries, for example UNIONing. Description: While helping OSDL with their derivative TPC-R benchmark, we ran into a query (#19) which took several hours to complete on PostgreSQL. http://developer.osdl.org/markw/dbt3-pgsql/ There's a short summary of the tests I ran over the weekend, with links to detailed results. Comparing runs 43 (7.4) and 52 (7.5devel), it looks like query #7 had the only significant improvement. Oprofile data should be there too, if that'll help. Let us know if there's anything else we can try for you. Mark
Re: [HACKERS] psql variables
Neil Conway wrote: Perhaps you're suggesting shell variables were used as the design model for psql's variables (although I can't be sure, you didn't elaborate). If so, what I'm saying is that this model is not very friendly for setting psql-internal options, and we'd be better changing it as I've elaborated on elsewhere. Do you agree? Shell variables did serve as a design model, mostly because I found it better to use *some* model rather than inventing behavior out of thin air. Consequently, I am sort of biased on this.
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
Jan Wieck [EMAIL PROTECTED] writes: The whole sync() vs. fsync() discussion is in my opinion nonsense at this point. The sync vs fsync discussion is not about performance, it is about correctness. You can't simply dismiss the fact that we don't know whether a checkpoint is really complete when we write the checkpoint record. I liked the idea put forward by (I think) Kevin Brown, that we issue sync to start the I/O and then a bunch of fsyncs to wait for it to finish. If sync behaves per spec (all the I/O is scheduled upon return) then the fsyncs will not affect I/O ordering in the least. But they will ensure that we don't proceed until the I/O is all done. Also there is the Windows-port problem of not having sync available. Doing the fsyncs only will provide an adequate, if possibly lower-performing, solution there. regards, tom lane
Re: [HACKERS] Proposed Query Planner TODO items
[EMAIL PROTECTED] writes: I'll see what I can do about the explain and explain analyze results. I remember in the past that someone said it would be most interesting to execute the latter while the test was running, as opposed to before or after a test. Should I do that here too? If possible, but I'd settle for a standalone result, so long as it's executed against the correct database contents (including pg_statistic settings). regards, tom lane
Re: [HACKERS] Proposed Query Planner TODO items
On 9 Feb, Tom Lane wrote: [EMAIL PROTECTED] writes: http://developer.osdl.org/markw/dbt3-pgsql/ There's a short summary of the tests I ran over the weekend, with links to detailed results. Comparing runs 43 (7.4) and 52 (7.5devel), it looks like query #7 had the only significant improvement. Oprofile data should be there too, if that'll help. Let us know if there's anything else we can try for you. I couldn't figure out anything at all from that, possibly because many of the links are dead, eg the task descriptions. I don't even see where you see the time for query #7. What would be interesting from my perspective is explain results (or even better, explain analyze results) for the problem queries. Any chance of extracting such a thing? Sorry about the task links, I think I've got that corrected. I'll see what I can do about the explain and explain analyze results. I remember in the past that someone said it would be most interesting to execute the latter while the test was running, as opposed to before or after a test. Should I do that here too? Mark
Re: [HACKERS] Proposed Query Planner TODO items
[EMAIL PROTECTED] writes: http://developer.osdl.org/markw/dbt3-pgsql/ There's a short summary of the tests I ran over the weekend, with links to detailed results. Comparing runs 43 (7.4) and 52 (7.5devel), it looks like query #7 had the only significant improvement. Oprofile data should be there too, if that'll help. Let us know if there's anything else we can try for you. I couldn't figure out anything at all from that, possibly because many of the links are dead, eg the task descriptions. I don't even see where you see the time for query #7. What would be interesting from my perspective is explain results (or even better, explain analyze results) for the problem queries. Any chance of extracting such a thing? regards, tom lane
Re: [HACKERS] 7.4.1 release status - Turkish Locale
Tom Lane [EMAIL PROTECTED] writes: It seems to me that that's too narrow a definition of the problem. I think we should state our goal as we don't want bizarre locale definitions to interfere with downcasing of the basic ASCII letters. If we put in a special case for 'I' we will fix the known problem with Turkish, but what other strange locales might be out there? And if we don't trust tolower() for 'I', why should we trust it for 'A'-'Z'? But then wouldn't it be a little weird for Turkish table and column names to treat I and İ (I think that's a dotted capital I) as equivalent to i instead of ı and i respectively. (I think that first one was a dotless i). Perhaps what really ought to be happening is that the downcasing should be done separately for keywords, or postponed until the point where it's checked to see if it's a keyword. Then it could be done using an entirely ascii-centric bit-twiddling implementation. If it matches an SQL keyword after being downcased the old fashioned way, then it's an SQL keyword. If not then the locale-aware tolower() would be appropriate for tables, columns, etc. But then perhaps that's unnecessarily complex. -- greg
Re: [HACKERS] Proposed Query Planner TODO items
On 9 Feb, Josh Berkus wrote: Mark, Ok, I've found that the kit does capture explain results and I've added a Query Plans link under the query time charts on each of the pages. Um, but I did notice a couple of problems. It looks like one of the 22 queries is missing and they're not labeled. I'll see about getting that fixed. If #19 is missing it's because Oleg and I could not get it to complete. That was also the query which we are most interested in testing. Oh, it's probably because we've altered Q19 and Q20. I'm still not all that familiar with this kit, so I'm learning as we go. So we need to change it back to make it worthwhile for you. Mark
[HACKERS] Question on pg_dump
I'm running Postgres v.7.3.4. In my database dump file I see this: CREATE FUNCTION plpgsql_call_handler () RETURNS language_handler AS '/home/tmichael/build/relipg21/syncinc/lib.sol2/plpgsql', 'plpgsql_call_handler' LANGUAGE c; The hardcoded library path may become an obstacle when loading data into a different server. Is there a way to avoid this? Thank you, Mike. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] psql variables
Peter Eisentraut [EMAIL PROTECTED] writes: Shell variables did serve as a design model, mostly because I found it better to use *some* model rather than inventing behavior out of thin air. Consequently, I am sort of biased on this. It does seem worth pointing out that shell variables have acted the way they do for a long time. I haven't heard all that many people complaining about it, so I'm unconvinced that the behavior is broken. I'm definitely against breaking backwards compatibility to improve it in psql. I might be willing to support an optional (not-default) mode with more error checking, though. What did you think of the \declare foo idea? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 7.4.1 release status - Turkish Locale
Greg Stark [EMAIL PROTECTED] writes: If it matches an SQL keyword after being downcased the old fashioned way, then it's an SQL keyword. If not then the locale-aware tolower() would be appropriate for tables, columns, etc. That's exactly what we do already. The complaint was that the locale-aware downcasing is broken (not to put it too finely) in Turkish locales, leading to unexpected/unwanted results for identifiers that are not keywords. My own opinion is that the correct response is to fix the Turkish locale tables, but I can see where that might be beyond the skills of the average Postgres user. Thus I thought a reasonable compromise would be to override the locale for the handling of A-Z, allowing it to determine what happens to high-bit-set characters only. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
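The compromise Tom describes — handle A-Z by plain ASCII arithmetic and let the locale's tolower() decide only for high-bit-set characters — can be sketched as follows. This is illustrative C only, not the actual backend (scansup.c) code; the function name is made up.

```c
#include <ctype.h>

/* Locale-proof identifier downcasing sketch: 'A'-'Z' are downcased by
 * arithmetic regardless of locale; only high-bit-set characters are
 * handed to the locale's tolower().  Illustrative, not backend code. */
static void
downcase_identifier_sketch(char *ident)
{
    for (; *ident; ident++)
    {
        unsigned char ch = (unsigned char) *ident;

        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';    /* immune to Turkish-style 'I' rules */
        else if (ch >= 0x80 && isupper(ch))
            ch = (unsigned char) tolower(ch);   /* locale decides here */
        *ident = (char) ch;
    }
}
```

Under this rule, SERIAL always becomes serial even in tr_TR, while a high-bit character such as Ü still downcases (or not) according to the locale tables.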
Re: [HACKERS] Proposed Query Planner TODO items
Josh Berkus [EMAIL PROTECTED] writes: If #19 is missing it's because Oleg and I could not get it to complete. That was also the query which we are most interested in testing. Q19 doesn't seem to be particularly slow in either the 7.4 or 7.5 tests --- there are many others with longer runtimes. I speculate that what is actually being run here is a modified Q19 query with the merge join condition pulled out by hand. The CVS-tip planner should be able to do that for itself, though, and obtain essentially this same performance with the per-spec query. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Question on pg_dump
Michael Brusser [EMAIL PROTECTED] writes: I'm running Postgres v.7.3.4. In my database dump file I see this: CREATE FUNCTION plpgsql_call_handler () RETURNS language_handler AS '/home/tmichael/build/relipg21/syncinc/lib.sol2/plpgsql', 'plpgsql_call_handler' LANGUAGE c; The hardcoded library path may become an obstacle when loading data into a different server. Is there a way to avoid this? The preferred way to write it nowadays is '$libdir/plpgsql', but you evidently have a legacy value embedded in your pg_proc table. pg_dump will not second-guess this, and so the old full-path approach will persist over dump/reloads until you do something about it. I'd suggest editing the dump file before you reload, or even manually updating pg_proc.probin for this function entry so that future dumps are right. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
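One way to do the suggested dump-file edit mechanically before reloading (a hypothetical one-liner; adapt the pattern to whatever path your dump actually contains):

```shell
# Demo input: a dump line with a hardcoded handler path (stand-in for a real dump)
printf "CREATE FUNCTION plpgsql_call_handler () RETURNS language_handler AS '/home/tmichael/build/relipg21/syncinc/lib.sol2/plpgsql', 'plpgsql_call_handler' LANGUAGE c;\n" > dump.sql

# Rewrite the hardcoded path to the portable $libdir form before reloading
sed "s|AS '[^']*/plpgsql'|AS '\$libdir/plpgsql'|" dump.sql > dump-portable.sql
```

The rewritten dump then loads on any server whose plpgsql library lives in its own $libdir, which is the point of the preferred notation.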
Re: [HACKERS] Preventing duplicate vacuums?
On Sat, 2004-02-07 at 02:07, Tom Lane wrote: Robert Treat [EMAIL PROTECTED] writes: Don't know if I would agree for sure, but if the second vacuum could see that it is being blocked by the current vacuum, exiting out would be a bonus, since in most scenarios you don't need to run that second vacuum so it just ends up wasting resources (or clogging other things up with its lock) This would be reasonable if we could do it, but the present lock manager doesn't provide any way to tell what sort of lock is blocking you. There are some cases in which it isn't obvious anyway. For instance, suppose an ALTER TABLE (which wants an exclusive lock) is queued up waiting for the currently-running VACUUM. An incoming new VACUUM request will queue behind the ALTER. Which lock would you say is blocking it ... and does an honest answer to that question jibe with your preference about whether the second VACUUM should give up? ISTM that both sides have trouble, since you could just as easily have a vacuum queued up behind an alter when your second vacuum comes in... A chintzy way out would be for VACUUM to just exit if it can't immediately acquire lock, regardless of the cause. This wouldn't be too useful for VACUUM FULL, but plain VACUUM is not blocked by very many common operations other than another VACUUM, so most of the time it would do what you want. I could possibly be talked into supporting an option to do that. This seems pretty useful to me. I thought about doing things like setting statement_timeout to some low number but that would generally cause the vacuum to timeout as well. Looking through postgresql.conf nothing else seems to apply... ISTR people asking for a general lock_timeout param that would cancel queries if they wait for a lock longer than x milliseconds... this seems like very similar functionality... Robert Treat -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 6: Have you searched our list archives? 
http://archives.postgresql.org
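The general lock_timeout parameter wished for above would behave roughly like this sketch (hypothetical usage; no such GUC existed in the PostgreSQL versions under discussion, unlike statement_timeout, which as noted would cancel the vacuum itself):

```sql
SET lock_timeout = '100ms';  -- give up if the lock is not granted within 100 ms
VACUUM mytable;              -- errors out fast instead of queueing behind other lockers
RESET lock_timeout;
```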
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
Jan Wieck [EMAIL PROTECTED] writes: The whole sync() vs. fsync() discussion is in my opinion nonsense at this point. Without the ability to limit the amount of files to a reasonable number, by employing tablespaces in the form of larger container files, the risk of forcing excessive head movement is simply too high. I don't think there was any suggestion of conflating tablespaces with implementing a filesystem in postgres. Tablespaces are just a database entity that database stored objects like tables and indexes are associated with. They group database stored objects and control the storage method and location. The existing storage mechanism, namely a directory with a file for each database object, is perfectly adequate and doesn't have to be replaced to implement tablespaces. All that's needed is that the location of the directory be associated with the tablespace of the object rather than be a global constant. Implementing an Oracle-style filesystem is just one more temptation to reimplement OS services in the database. Personally I think it's an awful idea. But even if postgres did it as an option, it wouldn't necessarily have anything to do with tablespaces. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
[HACKERS] BYTE_ORDER for contribs
Is there a quick way to use the BYTE_ORDER define as set by pgsql ? I can't find an entry point include for it. It's needed for postgis (problems with Solaris BYTE_ORDER). --strk; ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Two-phase commit
On Mon, Feb 09, 2004 at 10:09:34PM +0200, Heikki Linnakangas wrote: However, if this gets into 7.5, I guess you could just check for the version of the backend instead with SELECT version(). Hey, that works? That's very good news, because I was getting a bit worried about all the things I want to do in libpqxx that may depend on the Postgres version... Thanks! ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] CVS HEAD compile failure on Freebsd 4.9
Yes, here too... There are no headers included for struct timeval and function select in miscadmin.h. Adding the #includes below fixes the problem on freebsd, but I'm sure it's not a portable solution... Index: miscadmin.h === RCS file: /home/src/pgsql/repo/pgsql-server/src/include/miscadmin.h,v retrieving revision 1.152 diff -c -r1.152 miscadmin.h *** miscadmin.h 8 Feb 2004 22:28:57 - 1.152 --- miscadmin.h 9 Feb 2004 21:21:37 - *** *** 60,65 --- 60,69 * */ + #include <sys/types.h> + #include <sys/time.h> + #include <sys/select.h> + /* in globals.c */ /* these are marked volatile because they are set by signal handlers: */ extern DLLIMPORT volatile bool InterruptPending; On Mon, Feb 09, 2004 at 04:52:54PM +1300, Mark Kirkwood wrote: Fresh checkout of CVS HEAD yesterday, updated today : $ ./configure --prefix=/usr/local/pgsql/7.5 $ make gmake[4]: Entering directory `/usr/home/postgres/develop/c/pgsql/src/backend/access/nbtree' gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes -Wmissing-declarations -I../../../../src/include -c -o nbtree.o nbtree.c nbtree.c: In function `btbulkdelete': nbtree.c:600: storage size of `_delay' isn't known nbtree.c:600: warning: implicit declaration of function `select' nbtree.c:600: warning: unused variable `_delay' nbtree.c:602: storage size of `_delay' isn't known nbtree.c:602: warning: unused variable `_delay' gmake[4]: *** [nbtree.o] Error 1 gmake[4]: Leaving directory `/usr/home/postgres/develop/c/pgsql/src/backend/access/nbtree' gmake[3]: *** [nbtree-recursive] Error 2 gmake[3]: Leaving directory `/usr/home/postgres/develop/c/pgsql/src/backend/access' gmake[2]: *** [access-recursive] Error 2 gmake[2]: Leaving directory `/usr/home/postgres/develop/c/pgsql/src/backend' gmake[1]: *** [all] Error 2 gmake[1]: Leaving directory `/usr/home/postgres/develop/c/pgsql/src' gmake: *** [all] Error 2 *** Error code 2 $ uname -a FreeBSD spiney 4.9-RELEASE FreeBSD 4.9-RELEASE #3 kernel customizations : i686 (i.e 386-586 de-selected for kernel build) ---(end
of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] 7.4.1 release status - Turkish Locale
Tom Lane [EMAIL PROTECTED] writes: Greg Stark [EMAIL PROTECTED] writes: If it matches an SQL keyword after being downcased the old fashioned way, then it's an SQL keyword. If not then the locale-aware tolower() would be appropriate for tables, columns, etc. That's exactly what we do already. The complaint was that the locale-aware downcasing is broken (not to put it too finely) in Turkish locales, leading to unexpected/unwanted results for identifiers that are not keywords. But the example given was SERIAL. serial is an English word, not a Turkish word. It shouldn't really be subject to Turkish locale effects at all. Perhaps keyword wasn't the right word in my message. I'm wondering if he really expects all identifiers to be subject to this ascii downcasing. Like, if he had a GÜNAYDIN column he might be surprised when günaydın (where ı is the lowercase dotless i) says column günaydın doesn't exist. Or is the real problem simply that both styles of i really ought to match all the time, ie, that they should really be considered the same letter for matches? I wonder if there are other locales where that's an issue. -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 7.4.1 release status - Turkish Locale
Greg Stark [EMAIL PROTECTED] writes: But the example given was SERIAL. serial is an English word, not a Turkish word. It shouldn't really be subject to Turkish locale effects at all. SERIAL is not a keyword according to the grammar. Neither are PUBLIC, VOID, INT4, and numerous other examples. It's not appropriate to try to fix this by making them all keywords --- that will just create other problems. (And where do you draw the line, anyway? Should every identifier present in the default system catalogs become a keyword?) I'm wondering if he really expects all identifiers to be subject to this ascii downcasing. Without doubt it isn't ideal, but if we don't do something then a lot of stuff starting with initdb is broken. We could perhaps work around the problem by spelling everything in lower-case in all the commands we issue, but I can't see that as an acceptable answer either. We can't expect to control all the SQL sent to a database. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Proposed Query Planner TODO items
Jenny, For 19, we moved the common conditions out of the big ORs, for 20, we added distinct. We can change the query back if the optimizer can handle it now. Well, we want to test if it can. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
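For reference, the kind of hand modification being discussed for Q19 — hoisting the join condition that is common to every OR arm so the planner can see it — looks roughly like this (schematic SQL; column names follow TPC-H, predicates abbreviated):

```sql
-- Per-spec shape: the join condition repeats inside each OR arm
SELECT sum(l_extendedprice)
FROM lineitem, part
WHERE (p_partkey = l_partkey AND p_brand = 'Brand#12' AND l_quantity < 11)
   OR (p_partkey = l_partkey AND p_brand = 'Brand#23' AND l_quantity < 20);

-- Hand-modified shape: the common condition hoisted out, exposing the join
SELECT sum(l_extendedprice)
FROM lineitem, part
WHERE p_partkey = l_partkey
  AND ((p_brand = 'Brand#12' AND l_quantity < 11)
    OR (p_brand = 'Brand#23' AND l_quantity < 20));
```

The point of the planner TODO item is that the second rewrite should not have to be done by hand.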
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
Jan Wieck [EMAIL PROTECTED] writes: Doing this is not just what you call it. In a system with let's say 500 active backends on a database with let's say 1000 things that are represented as a file, you'll need half a million virtual file descriptors. [shrug] We've been dealing with virtual file descriptors for years. I've seen no indication that they create any performance bottlenecks. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] CVS HEAD compile failure on Freebsd 4.9
Tom Lane wrote: Vsevolod Lobko [EMAIL PROTECTED] writes: Yes, here too... There are no headers included for struct timeval and function select in miscadmin.h Certain parties who shall remain nameless have been gratuitously scattering dependencies on select() into lots of modules that don't need them. (When you are about to call the kernel to implement a multi-millisecond delay, there is no value in saving a nanosecond by inlining the code for it...) I plan to fix that later tonight if no one beats me to it. The scattering isn't actually from Win32, which is where I thought it came from. It is from the PG_DELAY call that was added with: revision 1.137 date: 2003/11/13 14:57:15; author: wieck; state: Exp; lines: +8 -1 2nd try for the ARC strategy. I added a couple more Assertions while tracking down the exact cause of the former bug. All 93 regression tests pass now. Jan What Win32 has done is to centralize all delay calls around that function, now renamed PG_USLEEP/PG_MSLEEP. I was going to add the missing includes but if you want to centralize it, I think that makes more sense. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] CVS HEAD compile failure on Freebsd 4.9
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: What Win32 has done is to centralize all delay calls around that function, now renamed PG_USLEEP/PG_MSLEEP. I was going to add the missing includes but if you want to centralize it, I think that makes more sense. Yeah, I'm planning to add a pgsleep module in src/port. Might as well do it right. Agreed. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
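The centralized select()-based delay being moved into a pgsleep module might look roughly like this (the function name and shape here are illustrative, not the actual src/port code):

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>

/* Sleep for the given number of microseconds by handing select() an
 * empty fd set and a timeout -- the same trick the backend's inline
 * PG_DELAY/PG_USLEEP macros used.  Illustrative sketch only. */
static void
pg_usleep_sketch(long microsec)
{
    if (microsec > 0)
    {
        struct timeval delay;

        delay.tv_sec = microsec / 1000000L;
        delay.tv_usec = microsec % 1000000L;
        (void) select(0, NULL, NULL, NULL, &delay);
    }
}
```

Centralizing this in one translation unit also means the sys/time.h and sys/select.h dependencies live in exactly one place instead of being scattered through headers like miscadmin.h.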
Re: [HACKERS] Transaction aborts on syntax error.
Gavin Sherry wrote: It's not that there's a rationale behind it. Rather, the existing error handling code *has* to abort the current transaction because an error has taken place. In a multi-statement transaction block (ie, BEGIN; ...; ...; ... COMMIT;) each statement piggybacks onto the whole transaction. Because we've aborted one query, we've aborted them all. With nested transactions, every query within a transaction block could be run within its own (sub)transaction. The backend could be jigged so that if parse errors occur, we abort the second-level transaction and roll back to the start point at the moment before the error-generating statement took place. This keeps the rest of the queries executed in the transaction block in place. Who is currently working on this [nested transactions] and what specifically needs to be done at this point? This is a major bug which greatly diminishes the confidence of my co-workers in postgresql. I don't have a wealth of knowledge about RDBMS implementations. How can I best contribute to solve this problem? Andrej ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
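The subtransaction rollback Gavin describes is essentially what SQL savepoints provide; a hypothetical sketch of how such a transaction block would behave once the feature exists (SAVEPOINT syntax assumed — it was not available in the PostgreSQL versions discussed in this thread):

```sql
BEGIN;
INSERT INTO t VALUES (1);
SAVEPOINT before_risky;
INSERT INTO t VALUES ('oops');        -- the error aborts only the subtransaction
ROLLBACK TO SAVEPOINT before_risky;   -- earlier work in the block is still intact
INSERT INTO t VALUES (2);
COMMIT;                               -- rows 1 and 2 both commit
```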
Re: [PATCHES] [HACKERS] dollar quoting
Tom Lane wrote: A bigger problem here: + else if (!dol_quote && line[i] == '$' && + !isdigit(line[i + thislen]) && + (dol_end = strchr(line+i+1,'$')) != NULL && + (i == 0 || + ! ((line[i-1] & 0x80) != 0 || isalnum(line[i-1]) || + line[i-1] == '_'))) + { is that you aren't checking that what comes between the two dollar signs looks like empty-or-an-identifier. The check for next-char-isn't-a-digit is part of that but not the only part. Well, I think the right way to do a full check would be with a regex, which I had hoped to avoid. However, I will now try to get one working and to address your other concerns. Thanks for the comments. cheers andrew ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PATCHES] [HACKERS] dollar quoting
I think the attached patch addresses Tom's comments. I ended up not using a regex, which seemed to be a little heavy-handed, but just writing a small custom recognition function that should (and I think does) mimic the pattern recognition for these tokens used by the backend lexer. This patch just puts that function in mainloop.c, but perhaps it belongs elsewhere (string_utils.c maybe?). I don't have strong opinions on that. Enjoy andrew Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Comments welcome. Reviewers: I am not sure I got multi-byte stuff right in psql/mainloop.c - please pay close attention to that. The i-1 stuff should generally be i-prevlen. Not sure if there are any other pitfalls. A bigger problem here: + else if (!dol_quote && line[i] == '$' && + !isdigit(line[i + thislen]) && + (dol_end = strchr(line+i+1,'$')) != NULL && + (i == 0 || + ! ((line[i-1] & 0x80) != 0 || isalnum(line[i-1]) || + line[i-1] == '_'))) + { is that you aren't checking that what comes between the two dollar signs looks like empty-or-an-identifier. The check for next-char-isn't-a-digit is part of that but not the only part. Also I'm not sure about the positioning of these tests relative to the in_quote and in_xcomment tests. As you have it, $foo$ will be recognized within an xcomment, which I think is at variance with the proposed backend lexing behavior. Also, the strdup should be pg_strdup. 
regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] Index: src/backend/parser/scan.l === RCS file: /projects/cvsroot/pgsql-server/src/backend/parser/scan.l,v retrieving revision 1.112 diff -c -w -r1.112 scan.l *** src/backend/parser/scan.l 29 Nov 2003 19:51:52 - 1.112 --- src/backend/parser/scan.l 9 Feb 2004 15:26:34 - *** *** 39,44 --- 39,46 static int xcdepth = 0; /* depth of nesting in slash-star comments */ + static char *dolqstart; /* current $foo$ quote start string */ + /* * literalbuf is used to accumulate literal values when multiple rules * are needed to parse a single literal. Call startlit to reset buffer *** *** 95,100 --- 97,103 * xd delimited identifiers (double-quoted identifiers) * xh hexadecimal numeric string * xq quoted strings + * dolq $foo$-style quoted strings */ %x xb *** *** 102,107 --- 105,111 %x xd %x xh %x xq + %x dolq /* Bit string * It is tempting to scan the string for only those characters *** *** 141,146 --- 145,159 xqoctesc [\\][0-7]{1,3} xqcat {quote}{whitespace_with_newline}{quote} + /* $foo$ style quotes (dollar quoting) + * The quoted string starts with $foo$ where foo is an optional string + * in the form of an identifier, except that it may not contain $, + * and extends to the first occurrence + * of an identical string. There is *no* processing of the quoted text. + */ + dolqdelim \$([A-Za-z\200-\377][A-Za-z\200-\377_0-9]*)?\$ + dolqinside [^$]+ + /* Double quote * Allows embedded spaces and other special characters into identifiers. */ *** *** 387,392 --- 400,434 } <xq><<EOF>> { yyerror("unterminated quoted string"); } + {dolqdelim} { + token_start = yytext; + dolqstart = pstrdup(yytext); + BEGIN(dolq); + startlit(); + } + <dolq>{dolqdelim} { + if (strcmp(yytext, dolqstart) == 0) + { + pfree(dolqstart); + BEGIN(INITIAL); + yylval.str = litbufdup(); + return SCONST; + } + /* + * When we fail to match $...$ to dolqstart, transfer + * the $... 
part to the output, but put back the final + * $ for rescanning. Consider $delim$...$junk$delim$ + */ + addlit(yytext, yyleng-1); + yyless(yyleng-1); + } + <dolq>{dolqinside} { + addlit(yytext, yyleng); + } + <dolq>. { + addlitchar(yytext[0]); + } + <dolq><<EOF>> { yyerror("unterminated special-quoted string"); } {xdstart} { token_start = yytext; Index: src/bin/psql/mainloop.c === RCS file: /projects/cvsroot/pgsql-server/src/bin/psql/mainloop.c,v retrieving revision 1.61 diff -c -w -r1.61 mainloop.c *** src/bin/psql/mainloop.c 25 Jan 2004 03:07:22 - 1.61 --- src/bin/psql/mainloop.c 9 Feb 2004 15:26:51 - *** *** 21,26 --- 21,61 sigjmp_buf main_loop_jmp; #endif + /* + * function to detect a valid $foo$ quote delimiter at the start
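A standalone sketch of the kind of recognition function discussed in this thread, matching the flex pattern \$([A-Za-z\200-\377][A-Za-z\200-\377_0-9]*)?\$ from the patch (hypothetical helper, not the code actually submitted):

```c
#include <ctype.h>

/* Return the length of a valid $foo$ quote delimiter starting at p,
 * or 0 if p does not start one.  The tag may be empty ($$) or an
 * identifier-like string: a letter or high-bit-set byte, then letters,
 * digits, underscores, or high-bit-set bytes.  Illustrative only. */
static int
dolq_delim_len(const unsigned char *p)
{
    int i;

    if (p[0] != '$')
        return 0;
    if (p[1] == '$')
        return 2;               /* the anonymous $$ delimiter */
    if (!(isalpha(p[1]) || p[1] >= 0200))
        return 0;               /* tag must not start with a digit */
    for (i = 2; isalnum(p[i]) || p[i] == '_' || p[i] >= 0200; i++)
        ;                       /* scan the rest of the tag */
    return (p[i] == '$') ? i + 1 : 0;
}
```

This enforces the empty-or-an-identifier rule Tom asked for, which the original next-char-isn't-a-digit check only partially covered.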
Re: [HACKERS] Linking the previously separated documentation
Michael Glaesemann wrote: If I'm understanding you correctly, that's what I'm doing. Here's an example of the change: Original: Please familiarize yourself with the <citerefentry><refentrytitle>pg_dump</> reference page. Revised: Please familiarize yourself with the <citerefentry><refentrytitle><xref linkend="APP-PGDUMP"/></> reference page. If you write xref then you already say that this is a cross-reference, so you don't need the citerefentry element. (citerefentry is typically used to automatically link all occurrences to the location of the manpage. For instance, FreeBSD uses it to link all refentry citations to their online manpage gateway.) ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Linking the previously separated documentation
On Feb 9, 2004, at 6:23 PM, Peter Eisentraut wrote: Michael Glaesemann wrote: If I'm understanding you correctly, that's what I'm doing. Here's an example of the change: Original: Please familiarize yourself with the <citerefentry><refentrytitle>pg_dump</> reference page. Revised: Please familiarize yourself with the <citerefentry><refentrytitle><xref linkend="APP-PGDUMP"/></> reference page. If you write xref then you already say that this is a cross-reference, so you don't need the citerefentry element. (citerefentry is typically used to automatically link all occurrences to the location of the manpage. For instance, FreeBSD uses it to link all refentry citations to their online manpage gateway.) Thanks, Peter! Good to know. Michael Glaesemann grzm myrealbox com ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org