Re: [HACKERS] Kerberos as source of user name? (Re: [BUGS] segfault in psql on x86_64)
Tom Lane <[EMAIL PROTECTED]> writes:

> Orion Henry <[EMAIL PROTECTED]> writes:
>> It appears to be faulting on a kerberos call which is odd because I
>> don't use kerberos for anything.
>
> I was a bit surprised to realize that if you compile Kerberos support
> at all, libpq will try to get a user name from Kerberos in preference
> to using getpwuid(). This strikes me as odd and surprising behavior.
> There's certainly no security reason for it, since we are only
> getting a default user name that can be trivially overridden.

Harumph. I reported this about a year ago:
http://archives.postgresql.org/pgsql-general/2002-12/msg00740.php

I'm not sure it can be fixed by just not setting the default username, though. In fact I think there's something a little backwards about deciding on a default username in advance and then trying various authentication methods.

In my case I have a kerberos principal [EMAIL PROTECTED] and a local username of stark. It seems like it should try to do the kerberos authentication as username gsstark (or even [EMAIL PROTECTED], since the realm is significant). And if that fails, then it should try to log in as stark using unix userid authentication.

The only fear I have with that direction is that it makes things a bit unpredictable. I could see it being weird having scripts randomly fail because they logged in as the wrong user if the tickets happened to have expired or the network goes down.

-- 
greg
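To illustrate the "trivially overridden" point: a minimal libpq sketch (the dbname and user values are invented) that supplies the user name explicitly, so the Kerberos-vs-getpwuid() default never comes into play:

/*
 * Minimal sketch, assuming a reachable server and a database named
 * "test"; passing user= explicitly sidesteps whatever default user
 * name libpq would otherwise derive.
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn *conn = PQconnectdb("dbname=test user=stark");

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		PQfinish(conn);
		return EXIT_FAILURE;
	}
	printf("connected as: %s\n", PQuser(conn));	/* the user name actually in use */
	PQfinish(conn);
	return EXIT_SUCCESS;
}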
[HACKERS] session persistent data for plperl
The attached tiny patch (not intended for application yet) provides a space for plperl functions to create and share session persistent data, which I should think would increase the utility of plperl. Essentially it creates a hash called %session_globals which it then injects into the safe container where plperl functions live.

Comments are welcome - this is just a proof of concept. If this seems good to people, I will try to use a similar mechanism to register plperl functions so they can call each other.

cheers

andrew

(stupid) example use (using dollar quoting in psql ;-) ):

andrew=# create function perlfunc() returns text language plperl as $$
andrew$# if (!exists $session_globals{x}) { $session_globals{x} = 'abcxyz'; }
andrew$# return $session_globals{x}++;
andrew$# $$;
CREATE FUNCTION
andrew=# select perlfunc();
 perlfunc
----------
 abcxyz
(1 row)

andrew=# select perlfunc();
 perlfunc
----------
 abcxza
(1 row)

andrew=# create function perlfunc2() returns text language plperl as $$
andrew$# if (!exists $session_globals{x}) { $session_globals{x} = 'abcxyz'; }
andrew$# $session_globals{x} = reverse $session_globals{x};
andrew$# return $session_globals{x}++;
andrew$# $$;
CREATE FUNCTION
andrew=# select perlfunc2();
 perlfunc2
-----------
 bzxcba
(1 row)

andrew=# select perlfunc2();
 perlfunc2
-----------
 bbcxzb
(1 row)

andrew=# select perlfunc();
 perlfunc
----------
 bbcxzc
(1 row)
Re: [HACKERS] session persistent data for plperl
I wrote:

> The attached tiny patch (not intended for application yet) provides a
> space for plperl functions to create and share session persistent
> data, which I should think would increase the utility of plperl.
> Essentially it creates a hash called %session_globals which it then
> injects into the safe container where plperl functions live.

...and the patch is attached here.

Index: src/pl/plperl/plperl.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/pl/plperl/plperl.c,v
retrieving revision 1.42
diff -c -w -r1.42 plperl.c
*** src/pl/plperl/plperl.c	6 Jan 2004 23:55:19 -0000	1.42
--- src/pl/plperl/plperl.c	8 Feb 2004 17:18:18 -0000
***************
*** 207,215 ****
   * no commas between the next 5 please. They are supposed to be
   * one string
   */
! require Safe; SPI::bootstrap();
  sub ::mksafefunc { my $x = new Safe; $x->permit_only(':default');$x->permit(':base_math');
! $x->share(qw[&elog &DEBUG &LOG &INFO &NOTICE &WARNING &ERROR]);
  return $x->reval(qq[sub { $_[0] }]); }
  sub ::mkunsafefunc {return eval(qq[ sub { $_[0] } ]); }
  };
--- 207,215 ----
   * no commas between the next 5 please. They are supposed to be
   * one string
   */
! require Safe; SPI::bootstrap(); use vars qw(%session_globals);
  sub ::mksafefunc { my $x = new Safe; $x->permit_only(':default');$x->permit(':base_math');
! $x->share(qw[&elog &DEBUG &LOG &INFO &NOTICE &WARNING &ERROR %session_globals]);
  return $x->reval(qq[sub { $_[0] }]); }
  sub ::mkunsafefunc {return eval(qq[ sub { $_[0] } ]); }
  };
Re: [HACKERS] session persistent data for plperl
What about transactions? Let's say the first call was in a transaction that's been rolled back. Semantically, that means the call never happened. If you maintain the session data in a table, everything is fine of course. But if it's in memory, the solution has some rather nasty implications to it.

This is one scenario where I think it would be useful if the backend provided some transaction callback mechanisms (discussed in this group earlier under the topic "Transaction callback"). That would enable the implementation of transaction-aware session persistent data in memory.

Regards,

- thomas

"Andrew Dunstan" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> The attached tiny patch (not intended for application yet) provides a
> space for plperl functions to create and share session persistent
> data, which I should think would increase the utility of plperl.
> Essentially it creates a hash called %session_globals which it then
> injects into the safe container where plperl functions live.
>
> Comments are welcome - this is just a proof of concept. If this seems
> good to people, I will try to use a similar mechanism to register
> plperl functions so they can call each other.
>
> cheers
>
> andrew
>
> (stupid) example use (using dollar quoting in psql ;-) ):
>
> andrew=# create function perlfunc() returns text language plperl as $$
> andrew$# if (!exists $session_globals{x}) { $session_globals{x} = 'abcxyz'; }
> andrew$# return $session_globals{x}++;
> andrew$# $$;
> CREATE FUNCTION
> andrew=# select perlfunc();
>  perlfunc
> ----------
>  abcxyz
> (1 row)
>
> andrew=# select perlfunc();
>  perlfunc
> ----------
>  abcxza
> (1 row)
>
> andrew=# create function perlfunc2() returns text language plperl as $$
> andrew$# if (!exists $session_globals{x}) { $session_globals{x} = 'abcxyz'; }
> andrew$# $session_globals{x} = reverse $session_globals{x};
> andrew$# return $session_globals{x}++;
> andrew$# $$;
> CREATE FUNCTION
> andrew=# select perlfunc2();
>  perlfunc2
> -----------
>  bzxcba
> (1 row)
>
> andrew=# select perlfunc2();
>  perlfunc2
> -----------
>  bbcxzb
> (1 row)
>
> andrew=# select perlfunc();
>  perlfunc
> ----------
>  bbcxzc
> (1 row)
Re: [HACKERS] RFC: Security documentation
"Nigel J. Andrews" <[EMAIL PROTECTED]> writes:

> On Sat, 7 Feb 2004, Alex J. Avriette wrote:
>> ... or somebody may be passing in the de rigueur
>> "'; select * from sensitive_table; ..." attempts (this is very
>> common, as you know, in CGI applications).
>
> Actually I can and it involves changing the backend to not permit
> multiple statements in one request. I can't imagine how that could
> sensibly be implemented, if at all, though.

Actually, the extended-query message in the new FE/BE protocol works exactly that way. This was done for protocol-simplicity reasons, not for security, but you could use it for that. The new protocol's ability to separate parameter values from the SQL command is also useful for ensuring security.

> At some stage your interface code has to accept responsibility for
> preventing dangerous input from reaching libpq.

However, I quite agree with that statement. The app programmer has to take responsibility for properly segregating or quoting data strings. We can (and do) provide tools to make this easier, but it's still the programmer's responsibility to use the tools correctly.

			regards, tom lane
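To make the parameter-separation point concrete, a hedged sketch using libpq's PQexecParams (new with the 7.4 protocol; the table and column names here are invented): the user-supplied value travels as data and is never parsed as SQL.

/*
 * Sketch only: the SQL text and the untrusted value are sent
 * separately, so "'; DROP TABLE ..." in the value cannot become SQL.
 */
#include <stdio.h>
#include <libpq-fe.h>

void
lookup_user(PGconn *conn, const char *untrusted_name)
{
	const char *paramValues[1];
	PGresult   *res;

	paramValues[0] = untrusted_name;	/* sent as data, never as SQL */

	res = PQexecParams(conn,
					   "SELECT id FROM users WHERE name = $1",
					   1,			/* one parameter */
					   NULL,		/* let the server infer its type */
					   paramValues,
					   NULL, NULL,	/* text format; lengths unused */
					   0);			/* text-format result */

	if (PQresultStatus(res) != PGRES_TUPLES_OK)
		fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
	PQclear(res);
}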
Re: [HACKERS] Kerberos as source of user name? (Re: [BUGS] segfault in psql on x86_64)
Greg Stark <[EMAIL PROTECTED]> writes:

> In fact I think there's something a little backwards about deciding on
> a default username in advance and then trying various authentication
> methods.

Perhaps, but we're stuck with that without a massive (and non-backwards-compatible) redesign of the connection protocol. libpq has to send a connection-request packet that includes the username before it knows which auth method will be selected. There are people around here who consider it a feature that pg_hba.conf can base the decision of which auth method to use on the supplied username...

> In my case I have a kerberos principal [EMAIL PROTECTED] and a local
> username of stark.

AFAICS libpq doesn't have any very principled way to choose which of those to use as the default username. But I'd prefer to see it make the same choice whether it's compiled with kerberos support or not. The present behavior doesn't seem to me to satisfy the principle of least astonishment.

In your situation, if you wanted to log in using kerberos authentication then you'd probably end up setting PGUSER=gsstark to get the right thing to happen.

			regards, tom lane
Re: [HACKERS] session persistent data for plperl
Thomas Hallgren <[EMAIL PROTECTED]> writes:

> Andrew Dunstan <[EMAIL PROTECTED]> wrote:
>> The attached tiny patch (not intended for application yet) provides a
>> space for plperl functions to create and share session persistent
>> data,
>
> What about transactions?

AFAICS, Andrew is just emulating a feature that has always existed in pltcl. Insisting that the plperl version must have intelligence not found in the pltcl version seems like raising the bar unfairly.

			regards, tom lane
Re: [HACKERS] RFC: Very large scale postgres support
Alex J. Avriette wrote:

> I feel that it would be a very good thing if some thinking on this
> subject was done. In the future, people will hopefully begin using
> postgres for more intense applications. We are looking at perhaps many
> tens of billions of transactions per day within the next year or two.

Tens of billions = 10^10 per day? This is probably a typo, because it would mean over 100,000 requests per second. Do you want to feed a monitor with pixel data right from the database, using individual queries for each pixel? Or record every keyclick of each irc user in the world concurrently online in a single database?

Regards,
Andreas
Re: [HACKERS] session persistent data for plperl
As Tom Lane noted, pltcl already allows this. In fact, it appears that plperl already allows functions to create their own session-persistent data, but doesn't share data between functions.

I guess there's a philosophical question: should we apply transactional safety to (transient) data not stored in the database? I can see both sides of the argument, but have no motivation myself to provide such transactional safety, at least not now.

cheers

andrew

Thomas Hallgren wrote:

> What about transactions? Let's say the first call was in a transaction
> that's been rolled back. Semantically, that means the call never
> happened. If you maintain the session data in a table, everything is
> fine of course. But if it's in memory the solution has some rather
> nasty implications to it.
>
> This is one scenario where I think it would be useful if the backend
> provided some transaction callback mechanisms (discussed in this group
> earlier under the topic "Transaction callback"). That would enable the
> implementation of transaction-aware session persistent data in memory.
>
> Regards,
>
> - thomas
>
> "Andrew Dunstan" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
>> The attached tiny patch (not intended for application yet) provides a
>> space for plperl functions to create and share session persistent
>> data, which I should think would increase the utility of plperl.
>> Essentially it creates a hash called %session_globals which it then
>> injects into the safe container where plperl functions live.
>>
>> Comments are welcome - this is just a proof of concept. If this seems
>> good to people, I will try to use a similar mechanism to register
>> plperl functions so they can call each other.
Re: [HACKERS] session persistent data for plperl
I'm not insisting anything. I merely suggest something that all PL implementations could potentially benefit from. But perhaps I should insist, both with respect to plperl, pltcl, and any other PL where session data is possible. IMHO, session data spanning more than one transaction is dangerous and error-prone if it's not coordinated with the transactions. Even if one can argue that it's the developer's responsibility to avoid the pitfalls, I still think it's a bit harsh to just disregard the benefits such mechanisms could gain from transaction callbacks.

Regards,

- thomas

Tom Lane <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Thomas Hallgren <[EMAIL PROTECTED]> writes:
>> Andrew Dunstan <[EMAIL PROTECTED]> wrote:
>>> The attached tiny patch (not intended for application yet) provides
>>> a space for plperl functions to create and share session persistent
>>> data,
>
>> What about transactions?
>
> AFAICS, Andrew is just emulating a feature that has always existed in
> pltcl. Insisting that the plperl version must have intelligence not
> found in the pltcl version seems like raising the bar unfairly.
>
> 			regards, tom lane
Re: [HACKERS] RFC: Very large scale postgres support
On Sun, Feb 08, 2004 at 08:07:14PM +0100, Andreas Pflug wrote:

>> I feel that it would be a very good thing if some thinking on this
>> subject was done. In the future, people will hopefully begin using
>> postgres for more intense applications. We are looking at perhaps
>> many tens of billions of transactions per day within the next year
>> or two.
>
> Tens of billions = 10^10 per day? This is probably a typo, because it
> would mean over 100,000 requests per second. Do you want to feed a
> monitor

That's what I said, and what I meant. Ten billion transactions equate to 115,740 transactions per second.

> with pixel data right from the database, using individual queries for
> each pixel? Or record every keyclick of each irc user in the world
> concurrently online in a single database?

Just because you don't think there is a valid use for that sort of traffic doesn't mean there isn't one. Imagine, if you will, a hundred thousand agents making four to five requests a second. Now, imagine these requests are all going to the same database. I'll leave the rest of this exercise up to you.

The fact is, there are situations in which such extreme traffic is warranted. My concern is that I am not able to use postgres in such situations because it cannot scale to that level. I feel that it would be possible to reach that level with support in the postmaster for replication. With software load balancing (e.g. rotors or similar) and updates between postmasters, it would be possible (it seems to me) to drastically increase the available capacity of a database installation through the addition of more nodes. This has the added benefit of allowing us to distribute network resources.

Alex

-- 
[EMAIL PROTECTED]
Alex J. Avriette, Unix Systems Gladiator
The Emperor Wears No Clothes. http://www.drugsense.org/wodclock.htm
[HACKERS] psql tab completion USERSET vars
While reviewing Andrew Dunstan's patch that implements the log_disconnections GUC var, I noticed that tab-complete.c in psql claims that only USERSET (and possibly SUSET) GUC variables should be included in the list of variables that can be auto-completed. That policy is clearly not followed: fsync, max_connections, port, shared_buffers, ssl, wal_sync_method, and various other non-USERSET or SUSET variables are presently included in the completion list.

The two obvious ways to fix this are to add the remaining variables to the completion list and remove the USERSET/SUSET policy, or to remove the variables that violate this policy. I think the former is a better idea, for two reasons:

(1) tab-completing GUC variables that one cannot set is still useful for SHOW

(2) enforcing semantic validity is something psql is clearly not in a position to do in the general case, so it strikes me as rather pointless to try

Any comments?

-Neil
Re: [HACKERS] [PATCHES] update i386 spinlock for hyperthreading
Tom Lane wrote:

> Manfred's numbers definitely say that we need to find a way to break
> down the BufMgrLock into multiple finer-grain locks. We already have
> all those per-buffer LWLocks, but I don't see how to apply those to
> the problem of managing the global lookup and replacement
> datastructures. Anyone see an attack path here?

Not specifically, but here are some ideas that might lead to something useful... Taking a leaf from somebody else's book is always the best way - much as has been done with the ARC algorithm itself.

1. "Scaling Oracle8i: Building Highly Scalable OLTP System Architectures", James Morle, ISBN 0-201-32574-8, seems like an interesting purchase in this regard (I have no connection with the author), but I can't vouch for its usefulness:
http://www.aw-bc.com/catalog/academic/product/0,4096,0201325748-TOC,00.html
This is nothing to do with OPS or RAC, which I definitely do not advocate as an approach to the current challenges.

2. DB2 has long offered multiple BUFFERPOOLs. Why not explicitly define multiple pools and then assign specific database objects to them - each would have its own MRU list.

...that spurs some further off-the-wall ideas that follow the more usual PostgreSQL line (design for automated operation): how to make use of multiple buffer pools without the need to manually assign database objects to them???

First a question: am I right in thinking that the problem is proportional to the number of independent actors (i.e. CPUs), but not dependent at all on the actual size of the cache?

3. Split the T1 list into 2 (maybe more?) pieces, completely independent of one another - T1a and T1b. When a T2, T1a, or T1b hit occurs, we *somehow* pick one of the T1 lists, in a pseudo-random fashion, and move the buffer to the top of that list. This would clearly not be the best that ARC can achieve, but if the buffer (and therefore the list) is big enough, then the skew between the two lists might well be less than the loss of performance from having a hot single MRU list. This might be regarded as vertical partitioning.

Jan Wieck writes:

> The theory of the whole ARC strategy (which is nothing more than four
> LRU's with a twist) is that the access frequency of blocks is reverse
> proportional to their numbers (a few are high frequency used, some are
> medium often requested, the vast majority rather seldom). That means,
> that the buffers in the upper third or quarter of the T1 queue (and
> that's the critical one) are basically circling around each other with
> frequent visitors from the lower regions or other queues.

4. Taking the analogy from (3) differently: partition T1 horizontally, like the floors of a building, or the league divisions in football (Major/Minor in the US, or 1st, 2nd, 3rd etc. in the UK). The ARC algorithm is identical, but T1 effectively has multiple MRU points, or put another way, multiple lists all chained together. Say we give it 3 levels - T1.1 through T1.3. When a block is first promoted from T2, it goes to the MRU of T1.3. When a hit occurs while it is in T1.3, it gets punted to the MRU of T1.2, and when a hit occurs while in T1.2 it gets promoted to the T1.1 MRU. On the way down, blocks falling off of T1.1 would go to T1.2, then T1.3. The T1.1 MRU would still be hotter than the rest. The way I am thinking about this now is that the MRU points are evenly spaced, so the ARC algorithm doesn't dynamically resize them as it does with T1/T2, but that could change also.

Doing things this way might mean that ARC doesn't have to remember which transaction id added a block (a specific ARC tailoring feature added for PostgreSQL), since even if an UPDATE/DELETE touches it twice, it won't move very far up the list. Would that save time on the list reordering code?

4a. Perhaps the MRU points should be spaced as 1/6, 1/3, 1/2 of the list, or some other fairly simple but non-linear way, in an attempt to balance the hit frequency of the MRU points. Perhaps use four levels, then split 1/16, 2/16, 4/16, 8/16 (so the last level is 9/16ths of the list).

4b. Combine this with idea (3) above to split that list into two or more equal-sized lists, T1.1a and T1.1b.

4c. Promote blocks differently, according to their content type? Put index non-leaf blocks (from fastroot down) and the rightmost leaf block that are in T1 straight into T2.1, put permanent table heap blocks and index leaf blocks into T2.2, and put temporary objects into T2.3. This would require tasting the block or enquiring details about its parent object rather than handling it blind (without regard to its contents, as ARC does now), and that sounds expensive.

Hope some of that helps.

Best Regards,

Simon Riggs
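To make the finer-grain-locks direction concrete in miniature, here is a toy C sketch (not PostgreSQL code; all names and the partition count are invented): hash a buffer tag to one of N independently locked partitions, so lookups in different partitions never contend on a single BufMgrLock-style mutex.

#include <pthread.h>
#include <stdint.h>

#define N_PARTITIONS 16			/* assumed: a power of two */

typedef struct
{
	pthread_mutex_t lock;
	/* a per-partition tag -> buffer hash table would live here */
} BufferPartition;

static BufferPartition partitions[N_PARTITIONS];

static void
init_partitions(void)
{
	for (int i = 0; i < N_PARTITIONS; i++)
		pthread_mutex_init(&partitions[i].lock, NULL);
}

/* pick the partition by hashing the buffer tag */
static BufferPartition *
partition_for(uint32_t tag_hash)
{
	return &partitions[tag_hash & (N_PARTITIONS - 1)];
}

/* run a lookup/insert under only that partition's lock */
static void
with_partition_locked(uint32_t tag_hash, void (*fn)(BufferPartition *))
{
	BufferPartition *p = partition_for(tag_hash);

	pthread_mutex_lock(&p->lock);
	fn(p);
	pthread_mutex_unlock(&p->lock);
}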
Re: [HACKERS] RFC: Security documentation
On Sun, Feb 08, 2004 at 01:33:31PM -0500, Tom Lane wrote:

>> Actually I can and it involves changing the backend to not permit
>> multiple statements in one request. I can't imagine how that could
>> sensibly be implemented, if at all, though.
>
> Actually, the extended-query message in the new FE/BE protocol works
> exactly that way. This was done for protocol-simplicity reasons, not
> for security, but you could use it for that. The new protocol's
> ability to separate parameter values from the SQL command is also
> useful for ensuring security.

(Tom is referring to this:
http://archives.postgresql.org/pgsql-interfaces/2003-03/msg00017.php)

How would you suggest implementing this? Having a "no subqueries" setting? Asking the postmaster to throw an exception on queries-within-data? I can think of several ways to do it, but I'd like to know what you had in mind.

>> At some stage your interface code has to accept responsibility for
>> preventing dangerous input from reaching libpq.
>
> However, I quite agree with that statement. The app programmer has to
> take responsibility for properly segregating or quoting data strings.
> We can (and do) provide tools to make this easier, but it's still the
> programmer's responsibility to use the tools correctly.

I agree with this as well. In my original message, I complained that there was no documentation at all. Since we offer documentation on how to code in plpgsql, pltcl, plperl, etc., it might be nice to include something. Even if it were something brief, such as suggesting escaping quotes and other suspicious characters, it would be better than the nothing that is there presently. Like I said, it allows some disclaiming of culpability for the programmer ("I did what the docs said") and it gives them an idea of where to start.

My initial feeling is that a small addition to the "Server Programming" section would be reasonable, or perhaps something in the Appendix. I can't see why anyone would be opposed to this, however. I'm happy to write the document and provide a patch for inclusion if we can come to agreement on some basic policies. The reason I posted the original message in this thread is that I wanted to know what others felt were appropriate policies, and to suggest said policies wind up in a doc.

Alex

-- 
[EMAIL PROTECTED]
Alex J. Avriette, Unix Systems Gladiator
I favor the Civil Rights Act of 1965, and it must be enforced at gunpoint if necessary. - Ronald Reagan
Re: [HACKERS] RFC: Very large scale postgres support
> The fact is, there are situations in which such extreme traffic is
> warranted. My concern is that I am not able to use postgres in such
> situations because it cannot scale to that level. I feel that it would
> be possible to reach that level with support in the postmaster for
> replication.

Replication won't help if those are all mostly write transactions. If a small percentage (even 1% would be challenging) is INSERTs, UPDATEs, or DELETEs, master/slave replication might get you somewhere. Otherwise you're going to need to partition the data up into smaller, easily managed sizes -- which of course requires an ability to horizontally partition the data.

Anyway, if you want a sane answer we need more information about the data (is it partitionable?), the schema type, the queries producing the load (simple or complex), acceptable data delays (does a new insert need to be immediately visible?), etc.

Dealing with a hundred thousand queries per second isn't just challenging for PostgreSQL; you will be hard pressed to find hardware that will push that much data around even with the overhead of the database itself.
Re: [HACKERS] RFC: Very large scale postgres support
On Sun, Feb 08, 2004 at 08:01:38PM -0500, Rod Taylor wrote:

> Replication won't help if those are all mostly write transactions. If
> a small percentage (even 1% would be challenging) is INSERTs, UPDATEs,
> or DELETEs, master/slave replication might get you somewhere.

There is no way on earth we could be doing writes at that rate. I think that's a given.

> Otherwise you're going to need to partition the data up into smaller,
> easily managed sizes -- that of course requires an ability to
> horizontally partition the data.

Obviously, this is the route we have taken.

> Anyway, if you want a sane answer we need more information about the
> data (is it partitionable?), schema type, queries producing the load
> (simple or complex), acceptable data delays (does a new insert need to
> be immediately visible?), etc.

We've considered a lot of this. Like I said, I think a lot of our need for distributing the database can be helped along with native replication.

Am I hearing that nobody believes scalability is a concern? I think many of us would like to see features that would allow large scale installations to be more practical. I also think most of us would agree that the current graft-on replication methods are sub-ideal.

alex

-- 
[EMAIL PROTECTED]
Alex J. Avriette, Unix Systems Gladiator
The Emperor Wears No Clothes. http://www.drugsense.org/wodclock.htm
Re: [HACKERS] RFC: Very large scale postgres support
On Sun, 2004-02-08 at 21:01, Alex J. Avriette wrote:

> On Sun, Feb 08, 2004 at 08:01:38PM -0500, Rod Taylor wrote:
>> Replication won't help if those are all mostly write transactions.
>> If a small percentage (even 1% would be challenging) is INSERTs,
>> UPDATEs, or DELETEs, master/slave replication might get you
>> somewhere.
>
> There is no way on earth we could be doing writes at that rate. I
> think that's a given.

Sure you can, if you can horizontally partition the data so that clients A are on machine A, clients B are on machine B, ...
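A toy sketch of that routing idea (the node count, key scheme, and function names are invented for illustration): hash the client key to pick which node's connection handles it.

#include <libpq-fe.h>

#define N_NODES 4

static PGconn *nodes[N_NODES];	/* one connection per partition server, opened elsewhere */

/* simple string hash (djb2) over the client key */
static unsigned
hash_key(const char *key)
{
	unsigned	h = 5381;

	while (*key)
		h = h * 33 + (unsigned char) *key++;
	return h;
}

/* route a client's traffic to "its" machine */
PGconn *
conn_for_client(const char *client_id)
{
	return nodes[hash_key(client_id) % N_NODES];
}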
Re: [HACKERS] RFC: Very large scale postgres support
> replication. Am I hearing that nobody believes scalability is a
> concern? I think many of us would like to see features that would
> allow large scale installations to be more practical. I also think
> most of us would agree that the current graft-on replication methods
> are sub-ideal.

You really haven't told us which approach to scaling you require. Many are available, and PostgreSQL is really only good at a few of them.

Anyway, current replication is a PITA, mostly due to its inability to easily start from an empty live database and catch up. You might want to throw a developer at helping SLONY along if you feel master/slave is the right direction, since presumably you will have several slaves.
Re: [HACKERS] RFC: Security documentation
"Alex J. Avriette" <[EMAIL PROTECTED]> writes:

> On Sun, Feb 08, 2004 at 01:33:31PM -0500, Tom Lane wrote:
>> Actually, the extended-query message in the new FE/BE protocol works
>> exactly that way.
>
> (Tom is referring to this:
> http://archives.postgresql.org/pgsql-interfaces/2003-03/msg00017.php)

That's not a particularly helpful link, since it predates the whole concept of the extended query protocol. See
http://www.postgresql.org/docs/7.4/static/protocol.html#PROTOCOL-QUERY-CONCEPTS
http://www.postgresql.org/docs/7.4/static/protocol-flow.html#AEN52626
particularly the NOTE in the latter section.

> How would you suggest implementing this? Having a "no subqueries"
> setting?

The app programmer could choose to use only extended queries and not simple Query messages. (If using libpq, this means only PQexecParams and never PQexec.)

> I agree with this as well. In my original message, I complained that
> there was no documentation at all. Since we offer documentation on how
> to code in plpgsql, pltcl, plperl, etc., it might be nice to include
> something. Even if it were something brief, such as suggesting escaped
> quotes and other suspicious characters, it would be better than the
> nothing that is there presently.

Is this nothing?
http://www.postgresql.org/docs/7.4/static/libpq-exec.html#LIBPQ-EXEC-ESCAPE-STRING

I don't think the docs are nearly as bereft of security-related items as you claim. They may be scattered and poorly indexed, but they're there.

			regards, tom lane
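For reference, a hedged example of using the escaping tool that page documents (the wrapper function is invented; PQescapeString itself is real libpq): it doubles quote characters so a data string can be embedded safely in a SQL literal.

#include <stdlib.h>
#include <string.h>
#include <libpq-fe.h>

/*
 * Sketch: wrap a raw data string in single quotes, escaping as needed.
 * PQescapeString needs a buffer of at least 2*length+1 bytes; we add
 * two more for the surrounding quotes. Caller frees the result.
 */
char *
quote_literal(const char *raw)
{
	size_t		rawlen = strlen(raw);
	char	   *buf = malloc(2 * rawlen + 3);
	size_t		n;

	if (buf == NULL)
		return NULL;
	buf[0] = '\'';
	n = PQescapeString(buf + 1, raw, rawlen);	/* returns bytes written */
	buf[n + 1] = '\'';
	buf[n + 2] = '\0';
	return buf;
}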
Re: [HACKERS] psql tab completion USERSET vars
Neil Conway <[EMAIL PROTECTED]> writes:

> While reviewing Andrew Dunstan's patch that implements the
> log_disconnections GUC var, I noticed that tab-complete.c in psql
> claims that only USERSET and possibly SUSET GUC variables should be
> included in the list of variables that can be auto-completed.

I think I was the one who put in that suggestion (which I agree has been largely ignored subsequently). The motivation was to avoid cluttering the tab completion facility with seldom-used parameters. As an example, the presence of the stats_xxx parameters in the tab list means more keystrokes are needed to type statement_timeout. Seems like this is a bad tradeoff, considering that people would rarely bother to look at the stats settings.

I'm not by any means wedded to the "USERSET and possibly SUSET" policy, but I would like to stop somewhere short of "include everything". Any thoughts?

			regards, tom lane
Re: [HACKERS] Transaction aborts on syntax error.
Bruce Momjian <[EMAIL PROTECTED]> writes:

> Edwin S. Ramirez wrote:
>> Hello,
>>
>> Is it possible to change the transaction behaviour not to abort when
>> a syntax error occurs? I've done some searches on the list, and have
>> not found anything.
>
> No, we need nested transactions for that. We are working on it, or at
> least have a plan.

I'm not clear why nested transactions are necessary. Oracle certainly doesn't require dealing with nested transactions to get this, and it's been a long time, but I don't recall MSSQL doing anything like that either. If I recall correctly, they both do this automatically by default.

I can see the rationale for aborting a transaction after a syntax error in an application where syntax errors are a sign of a problem. And I could see how nested transactions would be a good tool to deal with that. But why does the database enforce that every syntax error *requires* a transaction rollback? Shouldn't that be up to the application to decide? Perhaps the syntax error is for a known reason and the application would be fine with committing the previous changes or performing an alternate query.

In interactive use in particular, the application - actually the user - likely knows that the syntax error doesn't indicate any problem with the transaction at all. The user could see an error message, fix the query, and repeat it himself without having to invoke any special commands to begin and roll back a nested transaction. Humans are good at things like that.

I think this is as simple as an "interactive" or "manual error rollback" option that would make syntax errors not cause a transaction to fail at all. They could simply be ignored. Pretty much any database query that didn't cause any incomplete writes could be treated this way.

When I used Oracle, the fact that every sqlplus session was always in autocommit-off mode was oftentimes a lifesaver. I would do major database updates, then do several selects to verify that everything went as planned before committing. In postgres that's not feasible. I would have to remember, before beginning, to type BEGIN. Then as soon as I make a typo on one of those selects, the whole update has to be rolled back and done again.

Nested transactions would make it possible, but still not automatic. It would only work if I think in advance to start nested transactions, and then I would have to tediously roll back the nested transaction and start a new one for every typo.

I think the "typo implies transaction rollback" behavior fails the least-surprise principle. And nested transactions are a red herring. While they would be a useful tool for dealing with this situation programmatically, they shouldn't be necessary for dealing with it when a human is at the console.

-- 
greg
Re: [HACKERS] Transaction aborts on syntax error.
On Mon, 8 Feb 2004, Greg Stark wrote:

> Bruce Momjian <[EMAIL PROTECTED]> writes:
>> Edwin S. Ramirez wrote:
>>> Is it possible to change the transaction behaviour not to abort
>>> when a syntax error occurs? I've done some searches on the list,
>>> and have not found anything.
>
>> No, we need nested transactions for that. We are working on it or at
>> least have a plan.
>
> I'm not clear why nested transactions are necessary. Oracle certainly
> doesn't require dealing with nested transactions to get this, and it's
> been a long time but I don't recall MSSQL doing anything like that
> either. If I recall correctly they both do this automatically by
> default.
>
> I can see the rationale for aborting a transaction after a syntax
> error in an application where syntax errors are a sign of a problem.
> And I could see how nested transactions would be a good tool to deal
> with that.

It's not that there's a rationale behind it. Rather, the existing error handling code *has* to abort the current transaction because an error has taken place. In a multi-statement transaction block (ie, BEGIN; ...; ...; ...; COMMIT;) each statement piggybacks onto the whole transaction. Because we've aborted one query, we've aborted them all.

With nested transactions, every query within a transaction block could be run within its own (sub)transaction. The backend could be jigged so that if parse errors occur, we abort the second-level transaction and roll back to the start point at the moment before the error-generating statement took place. This keeps the rest of the queries executed in the transaction block in place.

> When I used Oracle the fact that every sqlplus session was always in
> autocommit-off mode was oftentimes a lifesaver. I would do major
> database updates, then do several selects to verify that everything
> went as planned before committing. In postgres that's not feasible. I
> would have to remember before beginning to type BEGIN. Then as soon as
> I make a typo on one of those selects the whole update has to be
> rolled back and done again. Nested transactions would make it
> possible, but still not automatic. It would only work if I think in
> advance to start nested transactions, and then I would have to
> tediously roll back the nested transaction and start a new one for
> every typo.

In psql:

\set AUTOCOMMIT off

Gavin
Re: [HACKERS] Transaction aborts on syntax error.
Greg Stark <[EMAIL PROTECTED]> writes:

> But why does the database enforce that every syntax error *requires* a
> transaction rollback?

PG enforces that every error requires a transaction abort. Period, full stop. Picking and choosing which errors might not really require a rollback involves a level of detailed code-behavior analysis (and consequent fragility in the face of changes) that no one has wanted to undertake.

As an example: "SELECT * FROM foo", where foo doesn't exist, will result in a "not found" error reported from somewhere in the catalog lookup code. Fine, we probably wouldn't need a transaction abort to clean up after that. But there are a ton of error cases right next door in that same code that we'd better do an abort to clean up after - deadlocks, corrupted data structures, who knows what. Even "not found" is problematic if the elog-induced longjmp occurs at a point where we're holding some locks or other resources that need to be released.

What it comes down to is that a lot of code in the backend assumes that transaction abort can be relied on to do any post-elog cleanup needed, such as releasing locks or reclaiming leaked memory. I don't think we can afford to give up that assumption; the costs in code complexity and instability would be horrific. What we have to do is generalize the abort cleanup code so it can handle partial rollbacks as well as complete ones. Thus "nested transactions" is really a shorthand for this problem of post-error cleanup.

> And nested transactions are a red herring.

You seem to think this is being driven by user-interface issues. It's an implementation necessity.

			regards, tom lane
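A hedged libpq illustration of the behavior Tom describes (PQtransactionStatus is new in 7.4; the misspelled query is a deliberate typo): once any error occurs inside a transaction block, the whole transaction is aborted and nothing succeeds until ROLLBACK.

#include <libpq-fe.h>

void
demo(PGconn *conn)
{
	PQclear(PQexec(conn, "BEGIN"));
	PQclear(PQexec(conn, "SELCT 1"));	/* deliberate syntax error */

	/* the transaction is now in the aborted state */
	if (PQtransactionStatus(conn) == PQTRANS_INERROR)
	{
		/* every further command fails until we roll back */
		PQclear(PQexec(conn, "ROLLBACK"));
	}
}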
[HACKERS] psql variables
Recently, I was surprised to learn that psql variables are case sensitive. Furthermore, there is no error when one attempts to '\set' a non-existent variable (arguably for good reason: I suppose it's too late now to get rid of user-defined psql variables). That leads to confusing situations like the following:

nconway=# \set autocommit off
nconway=# create table def (a int, b int);
CREATE TABLE
nconway=# abort;
WARNING:  there is no transaction in progress
ROLLBACK

One possible justification for this behavior is that it allows a simple naming convention to distinguish user-defined variables from psql-internal variables. Is that the case? If so, can't we achieve that aim via some other means? For example, by requiring a common prefix for psql-internal variables, akin to 'pg_*' for system catalogs prior to 7.3.

While we're on the subject, there are some other design choices in this area that seem a bit unfortunate. For example, we don't error out on references to undefined variables (so \echo :x could be an undefined variable or a variable that is set to the empty string; it isn't possible to tell). Also, the user can go ahead and overwrite the value of built-in variables like HOST and PORT, which doesn't seem like a very useful feature.

Now that 7.4's psql includes some genuinely useful psql variables like AUTOCOMMIT and VERBOSITY, it seems likely this feature will see more usage, so it would be nice to clean it up a bit.

-Neil
Re: [HACKERS] psql variables
Neil Conway <[EMAIL PROTECTED]> writes:

> Recently, I was surprised to learn that psql variables are case
> sensitive. Furthermore, there is no error when one attempts to '\set'
> a non-existent variable (arguably for good reason: I suppose it's too
> late now to get rid of user-defined psql variables).

[ blinks... ] This is historical revisionism. Psql variables were invented to provide user-defined variables; it is the predefined ones that are a wart added to the mechanism, not vice versa.

You could argue it either way about whether their names should be case-sensitive or not. I think it's too late to revisit that choice, given that it's not 100% wrong.

> While we're on the subject, there are some other design choices in
> this area that seem a bit unfortunate. For example, we don't error out
> on references to undefined variables (so \echo :x could be an
> undefined variable or a variable that is set to the empty string, it
> isn't possible to tell).

That one bothers me a bit too; perhaps Peter can give a justification.

> Also, the user can go ahead and overwrite the value of built-in
> variables like HOST and PORT, which doesn't seem like a very useful
> feature.

Again, you're letting the tail wag the dog. If we did that, then any addition of a new built-in variable would risk breaking existing user scripts that happened to use that name as an ordinary variable. It wouldn't be a bad idea to document some restriction on which names might become predefined variables in future.

			regards, tom lane
Re: [HACKERS] RFC: Very large scale postgres support
On Sun, Feb 08, 2004 at 09:20:07PM -0500, Rod Taylor wrote:

> On Sun, 2004-02-08 at 21:01, Alex J. Avriette wrote:
>> On Sun, Feb 08, 2004 at 08:01:38PM -0500, Rod Taylor wrote:
>>> Replication won't help if those are all mostly write transactions.
>>> If a small percentage (even 1% would be challenging) is INSERTs,
>>> UPDATEs, or DELETEs, master/slave replication might get you
>>> somewhere.
>
>> There is no way on earth we could be doing writes at that rate. I
>> think that's a given.
>
> Sure you can, if you can horizontally partition the data so clients A
> are on machine A, clients B are on machine B, ...

I think you were assuming inserts here. The problem actually comes from updates. The problem is, if I update here, how long before the rest of my cloud of postgres nodes understands that the record has been updated? With an insert, the transaction and propagation are fairly clear. With an update, the overall cost is higher, and the cost per node is higher.

Alex

-- 
[EMAIL PROTECTED]
Alex J. Avriette, Unix Systems Gladiator
You cannot invade the mainland United States. There would be a rifle behind each blade of grass. - Admiral Isoroku Yamamoto
Re: [PATCHES] [HACKERS] dollar quoting
Andrew Dunstan <[EMAIL PROTECTED]> writes:

> Comments welcome. Reviewers: I am not sure I got multi-byte stuff
> right in psql/mainloop.c - please pay close attention to that.

The "i-1" stuff should generally be "i-prevlen". Not sure if there are any other pitfalls.

A bigger problem here:

+ 	else if (!dol_quote && line[i] == '$' &&
+ 			 !isdigit(line[i + thislen]) &&
+ 			 (dol_end = strchr(line+i+1,'$')) != NULL &&
+ 			 (i == 0 ||
+ 			  ! ((line[i-1] & 0x80) != 0 || isalnum(line[i-1]) ||
+ 				 line[i-1] == '_')))
+ 	{

is that you aren't checking that what comes between the two dollar signs looks like empty-or-an-identifier. The check for next-char-isn't-a-digit is part of that, but not the only part.

Also, I'm not sure about the positioning of these tests relative to the in_quote and in_xcomment tests. As you have it, $foo$ will be recognized within an xcomment, which I think is at variance with the proposed backend lexing behavior.

Also, the strdup should be pg_strdup.

			regards, tom lane
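A hedged sketch of the missing check (the helper name and exact rules are illustrative, not part of the actual patch): everything between the two dollar signs should be empty or shaped like an identifier.

#include <ctype.h>
#include <stdbool.h>

/*
 * Accept an empty tag ($$) or something identifier-shaped: no leading
 * digit; letters, digits, underscores, or high-bit (possibly
 * multibyte) characters elsewhere.
 */
static bool
valid_dolquote_tag(const char *tag, const char *end)
{
	if (tag == end)
		return true;			/* empty tag is fine */
	if (isdigit((unsigned char) *tag))
		return false;			/* can't start with a digit */
	for (; tag < end; tag++)
	{
		if (!(isalnum((unsigned char) *tag) ||
			  *tag == '_' ||
			  (*tag & 0x80)))
			return false;
	}
	return true;
}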
[HACKERS] Linking the previously separated documentation
Hello all!

I've been eyeing providing links between the previously separated documentation books, e.g., linking this sentence

    Please familiarize yourself with the pg_dump reference page.

in the backup and restore section
http://www.postgresql.org/docs/current/static/backup.html
to the applicable reference page. I haven't gone through all of the docs yet (there are a lot of them!) and right now I'm just looking for obvious linking possibilities of the "see foo" variety. Is anyone else working on something like this? Is it worthwhile?

If so, I've got a question as to style. My first idea was not to change the text at all, and just replace (in the above example) pg_dump with <xref linkend="APP-PGDUMP">. Should I be rewriting these sections, or is what I'm doing agreeable? (Might not be a bad idea to work up a style book for the documentation to guide writing, but that's a project in and of itself.)

If everything seems fine, I'll see if I can't figure out how to submit a patch.

Michael Glaesemann
grzm myrealbox com
Re: [HACKERS] Linking the previously separated documentation
Michael Glaesemann <[EMAIL PROTECTED]> writes:

> I've been eyeing providing links between the previously separated
> documentation books, ... Is anyone else working on something like
> this? Is it worthwhile?

There's been talk of this earlier, but I don't recall anyone specifically saying they'd tackle it. It's definitely worth doing.

> If so, I've got a question as to style. My first idea was not to
> change the text at all, and just replace (in the above example)
> pg_dump with <xref linkend="APP-PGDUMP">. Should I be rewriting these
> sections or is what I'm doing agreeable?

There are (or at one time were) references along the line of "see the pg_dump page in the <link>reference manual</>". These obviously could do with rephrasing now, if you find any left.

As far as style goes, try to keep in mind that the link only helps for HTML-formatted output, and we do still try to support printing the documentation on dead trees. The reference should read well when the link infrastructure isn't there. I think this means you want to have "... see the <link>pg_dump</> reference page ..." and not just "... see <link>pg_dump</> ...", except where the context is pretty obvious, such as a SEE ALSO section of another reference page.

			regards, tom lane
Re: [HACKERS] Linking the previously separated documentation
On Feb 9, 2004, at 2:16 PM, Tom Lane wrote:

> Michael Glaesemann <[EMAIL PROTECTED]> writes:
>> I've been eyeing providing links between the previously separated
>> documentation books, ... Is anyone else working on something like
>> this? Is it worthwhile?
>
> There's been talk of this earlier, but I don't recall anyone
> specifically saying they'd tackle it. It's definitely worth doing.

Okay. I'll keep working.

>> If so, I've got a question as to style. My first idea was not to
>> change the text at all, and just replace (in the above example)
>> pg_dump with <xref linkend="APP-PGDUMP">. Should I be rewriting these
>> sections or is what I'm doing agreeable?
>
> There are (or at one time were) references along the line of "see the
> pg_dump page in the <link>reference manual</>". These obviously could
> do with rephrasing now, if you find any left.
>
> As far as style goes, try to keep in mind that the link only helps for
> HTML-formatted output, and we do still try to support printing the
> documentation on dead trees. The reference should read well when the
> link infrastructure isn't there. I think this means you want to have
> "... see the <link>pg_dump</> reference page ..." and not just "...
> see <link>pg_dump</> ...", except where the context is pretty obvious,
> such as a SEE ALSO section of another reference page.

If I'm understanding you correctly, that's what I'm doing. Here's an example of the change:

Original:
Please familiarize yourself with the <citerefentry><refentrytitle>pg_dump</></> reference page.

Revised:
Please familiarize yourself with the <citerefentry><refentrytitle><xref linkend="APP-PGDUMP"></></> reference page.

Doing it this way makes for quicker changes and fewer disruptions of any output flow, I believe. And it gets things linked. Rewriting could be worried about later.

Michael Glaesemann
grzm myrealbox com