[HACKERS] pg_dump & 7.0
>> >>Add a "FROM " after the "ON " to >>the CREATE CONSTRAINT TRIGGER statements. That's it. >> > >I'll make the change ASAP. > I'm about to do this - does anyone object to me adding the 7.0 backward compatibility changes at the same time? Philip Warner| __---_ Albatross Consulting Pty. Ltd. |/ - \ (A.B.N. 75 008 659 498) | /(@) __---_ Tel: (+61) 0500 83 82 81 | _ \ Fax: (+61) 0500 83 82 82 | ___ | Http://www.rhyme.com.au |/ \| |---- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Re: refusing connections based on load ...
On Tue, Apr 24, 2001 at 12:39:29PM +0800, Lincoln Yeoh wrote: > At 03:09 PM 23-04-2001 -0300, you wrote: > >Basically, if great to set max clients to 256, but if load hits 50 > >as a result, the database is near to useless ... if you set it to 256, > >and 254 idle connections are going, load won't rise much, so is safe, > >but if half of those processes are active, it hurts ... > > Sorry, but I still don't understand the reasons why one would want to do > this. Could someone explain? > > I'm thinking that if I allow 256 clients, and my hardware/OS bogs down > when 60 users are doing lots of queries, I either accept that, or > figure that my hardware/OS actually can't cope with that many clients > and reduce the max clients or upgrade the hardware (or maybe do a > little tweaking here and there). > > Why not be more deterministic about refusing connections and stick > to reducing max clients? If not it seems like a case where you're > promised something but when you need it, you can't have it. The point is that "number of connections" is a very poor estimate of system load. Sometimes a connection is busy, sometimes it's not. Some connections are busy, some are not. The goal is maximum throughput or some tradeoff of maximum throughput against latency. If system throughput varies nonlinearly with load (as it almost always does) then this happens at some particular load level. Refusing a connection and letting the client try again later can be a way to maximize throughput by keeping the system at the optimum point. (Waiting reduces delay. Yes, this is counterintuitive, but why do we queue up at ticket windows?) Delaying response, when under excessive load, to clients who already have a connection -- even if they just got one -- can have a similar effect, but with finer granularity and with less complexity in the clients. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
[HACKERS] ERROR: parser: parse error at or near "JOIN"
Is anyone else seeing this? I have the current CVS sources and "make check" ends up with one failure. My regression.diffs shows: *** ./expected/join.out Thu Dec 14 17:30:45 2000 --- ./results/join.out Mon Apr 23 20:23:15 2001 *** *** 1845,1851 -- UNION JOIN isn't implemented yet SELECT '' AS "xxx", * FROM J1_TBL UNION JOIN J2_TBL; ! ERROR: UNION JOIN is not implemented yet -- -- Clean up -- --- 1845,1851 -- UNION JOIN isn't implemented yet SELECT '' AS "xxx", * FROM J1_TBL UNION JOIN J2_TBL; ! ERROR: parser: parse error at or near "JOIN" -- -- Clean up -- == -- Fernando Nasser Red Hat Canada Ltd. E-Mail: [EMAIL PROTECTED] 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] refusing connections based on load ...
On Mon, Apr 23, 2001 at 10:50:42PM -0400, Tom Lane wrote: > Basically, if we do this then we are abandoning the notion that Postgres > runs as an unprivileged user. I think that's a BAD idea, especially in > an environment that's open enough that you might feel the need to > load-throttle your users. By definition you do not trust them, eh? No. It's not a case of trust, but of providing an adaptive way to keep performance reasonable. The users may have no independent way to cooperate to limit load, but the DB can provide that. > A less dangerous way of approaching it might be to have an option > whereby the postmaster invokes 'uptime' via system() every so often > (maybe once a minute?) and throttles on the basis of the results. > The reaction time would be poorer, but security would be a whole lot > better. Yes, this alternative looks much better to me. On Linux you have the much more efficient alternative, /proc/loadavg. (I wouldn't use system(), though.) Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] start / stop scripts question
Hi, I believe i found two minor bugs in the linux start/stop scripts for the downloadable rpm version of postgres 7.1. I don't think these have been reported already (i did some quik searches). Please look these over and see if i'm just smoking something or if these bugs are valid. Also, i did a quick cvs checkout / log of the contrib tree, and i noted that the start/stop scripts have been restructured recently (i do not know where logic of the scripts were moved to, so these points may still be valid, if not, i was wondering if I pull the scripts from the cvs contrib tree myself, would they work out of the box?). --- #1. Every instance of (there are 2): pid=`pidof postmaster` if [ $pid ] should be: pid=`pidof -s postmaster` if [ $pid ] (pidof may return multiple pids if postmaster forked or has multiple threads -- i'm not toofamiliar with postgres architecture, but postmaster does sometimes show multiple pids which could mean multiple threads or processes in linux) If pidof returns multiple pids, the "if" will barf giving something like the following: Stopping postgresql service: [ OK ] Checking postgresql installation: [ OK ] /etc/rc.d/init.d/postgresql: [: 1223: unary operator expected Starting postgresql service: [FAILED] #2. /etc/rc.d/init.d/postgresql restart sometimes doesn't do what it should. ie. end up with a fresh newly started postgres daemon. This happens because the rc.d script does something very simple: stop; start. This is correct, but stop doesn't do what it should. When stop returns, postgres may not have fully stopped for some reason. start complains that postmaster is still running. After doing some testing, my hypothesis is this (i have no idea how postgres works intermally): 1. I run a bunch of inserts, create tables 2. I call postgres stop 3. one of the postgres "processes" stops. 4. the other processes are still trying to flush stuff onto the disk before they quit. 5. start is called, and it finds some "postmaster" processes, and thus says "postmaster is running". 6. the other processes finally are done and stop. Now there are no more postgres running. When i added a sleep 10 between stop / start, everything was fine. The "correct" solution would be for postgres stop to actually wait for the entire db to exit cleanly. BTW, i uncovered this via an automated install / configuration / population of a postgress database which involves a restart right after population of a database. Thanx. -rchit ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] refusing connections based on load ...
Tom Lane <[EMAIL PROTECTED]> writes: > > Rather than do system('uptime') and incur the process start-up each time, > > you could do fp = popen('vmstat 60', 'r'), then just read the fp. > > popen doesn't incur a process start? Get real. But you're right, popen() > is the right call not system(), because you need to read the stdout. Tom, I think the point here is that the 'vmstat' process, once started, will keep printing status output every 60 seconds (if invoked as above) so you don't have to restart it every minute, just read the pipe. > > I believe vmstat is fairly standard. > > Not more so than uptime --- and the latter's output format is definitely > less variable across platforms. The HPUX man page goes so far as to say > > WARNINGS > Users of vmstat must not rely on the exact field widths and spacing of > its output, as these will vary depending on the system, the release of > HP-UX, and the data to be displayed. > > and that's just for *one* platform. A very valid objection. I'm also dubious as to the utility of the whole concept. What happens when Sendmail refuses a message based on load? It is requeued on the sending end to be tried later. What happens when PG refuses a new client connection based on load? The application stops working. Is this really better than having slow response time because the server is thrashing? I guess my point is that Sendmail is a store-and-forward situation where the mail system can "catch up" once the load returns to normal. Whereas, I would think, the majority of PG installations want a working database, and whether it's refusing connections due to load or simply bogged down isn't going to make a difference to users that can't get their data. -Doug -- The rain man gave me two cures; he said jump right in, The first was Texas medicine--the second was just railroad gin, And like a fool I mixed them, and it strangled up my mind, Now people just get uglier, and I got no sense of time... --Dylan ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Re: refusing connections based on load ...
At 03:09 PM 23-04-2001 -0300, you wrote: > >Anyone thought of implementing this, similar to how sendmail does it? If >load > n, refuse connections? > >Basically, if great to set max clients to 256, but if load hits 50 as a >result, the database is near to useless ... if you set it to 256, and 254 >idle connections are going, load won't rise much, so is safe, but if half >of those processes are active, it hurts ... Sorry, but I still don't understand the reasons why one would want to do this. Could someone explain? I'm thinking that if I allow 256 clients, and my hardware/OS bogs down when 60 users are doing lots of queries, I either accept that, or figure that my hardware/OS actually can't cope with that many clients and reduce the max clients or upgrade the hardware (or maybe do a little tweaking here and there). Why not be more deterministic about refusing connections and stick to reducing max clients? If not it seems like a case where you're promised something but when you need it, you can't have it. Cheerio, Link. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] refusing connections based on load ...
On 23 Apr 2001, Ian Lance Taylor wrote: > Tom Lane <[EMAIL PROTECTED]> writes: > > > On Linux and BSD it seems to be more common to put /dev/kmem into a > > specialized group "kmem", so running postgres as setgid kmem is not so > > immediately dangerous. Still, do you think it's a good idea to let an > > attacker have open-ended rights to read your kernel memory? It wouldn't > > take too much effort to sniff passwords, for example. > > On Linux you can get the load average by doing `cat /proc/loadavg'. > On NetBSD you can get the load average via a sysctl. On those systems > and others the uptime program is neither setuid nor setgid. Good call ... FreeBSD has it also, and needs no special privileges ... just checked, and the sysctl command isn't setuid/setgid anything, so I'm guessing that using sysctl() to pull these values shouldn't create any security issues on those systems that support it ? ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] refusing connections based on load ...
other then a potential buffer overrun, what would be the problem with: open(kmem) read values close(kmem) ? I would think it would be less taxing to the system then doing a system() call, but still effectively as safe, no? On Mon, 23 Apr 2001, Tom Lane wrote: > The Hermit Hacker <[EMAIL PROTECTED]> writes: > > On Mon, 23 Apr 2001, Tom Lane wrote: > >> sendmail expects to be root. > > > Actually, not totally accurate ... sendmail has a 'RunAs' option for those > > that don't wish to have it run as root, > > True, it doesn't *have* to be root, but the loadavg code still requires > privileges beyond those of mere mortals (as does listening on port 25, > last I checked). > > On my HPUX box: > > $ ls -l /dev/kmem > crw-r- 1 binsys 3 0x01 Jun 10 1996 /dev/kmem > > so postgres would have to run setuid bin or setgid sys to read the load > average. Either one is equivalent to giving an attacker the keys to the > kingdom (overwrite a few key /usr/bin/ executables and wait for root to > run one...) > > On Linux and BSD it seems to be more common to put /dev/kmem into a > specialized group "kmem", so running postgres as setgid kmem is not so > immediately dangerous. Still, do you think it's a good idea to let an > attacker have open-ended rights to read your kernel memory? It wouldn't > take too much effort to sniff passwords, for example. > > Basically, if we do this then we are abandoning the notion that Postgres > runs as an unprivileged user. I think that's a BAD idea, especially in > an environment that's open enough that you might feel the need to > load-throttle your users. By definition you do not trust them, eh? > > A less dangerous way of approaching it might be to have an option > whereby the postmaster invokes 'uptime' via system() every so often > (maybe once a minute?) and throttles on the basis of the results. > The reaction time would be poorer, but security would be a whole lot > better. > > regards, tom lane > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: [EMAIL PROTECTED] secondary: scrappy@{freebsd|postgresql}.org ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] refusing connections based on load ...
Tom Lane <[EMAIL PROTECTED]> writes: > On Linux and BSD it seems to be more common to put /dev/kmem into a > specialized group "kmem", so running postgres as setgid kmem is not so > immediately dangerous. Still, do you think it's a good idea to let an > attacker have open-ended rights to read your kernel memory? It wouldn't > take too much effort to sniff passwords, for example. On Linux you can get the load average by doing `cat /proc/loadavg'. On NetBSD you can get the load average via a sysctl. On those systems and others the uptime program is neither setuid nor setgid. > A less dangerous way of approaching it might be to have an option > whereby the postmaster invokes 'uptime' via system() every so often > (maybe once a minute?) and throttles on the basis of the results. > The reaction time would be poorer, but security would be a whole lot > better. That is the way to do it on systems where obtaining the load average requires special privileges. But do you really need the load average once a minute? The load average printed by uptime is just as accurate as the load average obtained by examining the kernel. Ian ---(end of broadcast)--- TIP 652: Life is a serious burden, which no thinking, humane person would wantonly inflict on someone else. -- Clarence Darrow ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] refusing connections based on load ...
> Rather than do system('uptime') and incur the process start-up each time, > you could do fp = popen('vmstat 60', 'r'), then just read the fp. popen doesn't incur a process start? Get real. But you're right, popen() is the right call not system(), because you need to read the stdout. > I believe vmstat is fairly standard. Not more so than uptime --- and the latter's output format is definitely less variable across platforms. The HPUX man page goes so far as to say WARNINGS Users of vmstat must not rely on the exact field widths and spacing of its output, as these will vary depending on the system, the release of HP-UX, and the data to be displayed. and that's just for *one* platform. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] refusing connections based on load ...
The Hermit Hacker <[EMAIL PROTECTED]> writes: > On Mon, 23 Apr 2001, Tom Lane wrote: >> sendmail expects to be root. > Actually, not totally accurate ... sendmail has a 'RunAs' option for those > that don't wish to have it run as root, True, it doesn't *have* to be root, but the loadavg code still requires privileges beyond those of mere mortals (as does listening on port 25, last I checked). On my HPUX box: $ ls -l /dev/kmem crw-r- 1 binsys 3 0x01 Jun 10 1996 /dev/kmem so postgres would have to run setuid bin or setgid sys to read the load average. Either one is equivalent to giving an attacker the keys to the kingdom (overwrite a few key /usr/bin/ executables and wait for root to run one...) On Linux and BSD it seems to be more common to put /dev/kmem into a specialized group "kmem", so running postgres as setgid kmem is not so immediately dangerous. Still, do you think it's a good idea to let an attacker have open-ended rights to read your kernel memory? It wouldn't take too much effort to sniff passwords, for example. Basically, if we do this then we are abandoning the notion that Postgres runs as an unprivileged user. I think that's a BAD idea, especially in an environment that's open enough that you might feel the need to load-throttle your users. By definition you do not trust them, eh? A less dangerous way of approaching it might be to have an option whereby the postmaster invokes 'uptime' via system() every so often (maybe once a minute?) and throttles on the basis of the results. The reaction time would be poorer, but security would be a whole lot better. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] refusing connections based on load ...
On Mon, 23 Apr 2001, Tom Lane wrote: > The Hermit Hacker <[EMAIL PROTECTED]> writes: > > sendmail does it now, and, apparently relatively portable across OSs ... > > sendmail expects to be root. It's unlikely (and very undesirable) that > postgres will be installed with adequate privileges to read /dev/kmem, > which is what it'd take to run the sendmail loadaverage code on most > platforms... Actually, not totally accurate ... sendmail has a 'RunAs' option for those that don't wish to have it run as root, and still works for the loadavg stuff, to the best of my knowledge (its an option I haven't played with yet) ... ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Re: Hardcopy docs available
On Tue, 24 Apr 2001, Thomas Lockhart wrote: > > > ... if there is interest in an A4 layout of the docs, let me know... > > I've gotten several requests for the A4 format, and have completed four > > of the six docs in that format. Thanks for the feedback. They should be > > available in the next couple of days... > > OK, A4 docs are now posted on the web site and the ftp site. Also, I've > put copies of the html tarballs on the ftp site, so there should now be > tarballs, two kinds of postscript, and PDFs available there. > > If someone wants to run the A4 docs through a PDF converter, send 'em to > me and I'll post them too. Tom, ps2pdf is on hub. Vince. -- == Vince Vielhaber -- KA8CSHemail: [EMAIL PROTECTED]http://www.pop4.net 56K Nationwide Dialup from $16.00/mo at Pop4 Networking Online Campground Directoryhttp://www.camping-usa.com Online Giftshop Superstorehttp://www.cloudninegifts.com == ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] refusing connections based on load ...
The Hermit Hacker <[EMAIL PROTECTED]> writes: > sendmail does it now, and, apparently relatively portable across OSs ... sendmail expects to be root. It's unlikely (and very undesirable) that postgres will be installed with adequate privileges to read /dev/kmem, which is what it'd take to run the sendmail loadaverage code on most platforms... regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] concurrent Postgres on NUMA - howto ?
"Mauricio Breternitz" <[EMAIL PROTECTED]> writes: > My concern is whether that is enough to maintain consistency > in the buffer cache No, it isn't --- for one thing, WriteBuffer wouldn't cause other backends to update their copies of the page. At the very least you'd need to synchronize where the LockBuffer calls are, not where WriteBuffer is called. I really question whether you want to do anything like this at all. Seems like accessing the shared buffers right where they are will be fastest; your approach will entail a huge amount of extra data copying. Considering that a backend doesn't normally touch every byte on a page that it accesses, I wouldn't be surprised if full-page copying would net out to being more shared-memory traffic, rather than less. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: SET SESSION AUTHORIZATION (was Re: [HACKERS] Real/effectiveuser)
Karel Zak writes: > Great! With this feature is possible use persisten connection and > on-the-fly changing actual user, right? It's very useful for example > web application that checking user privilege via SQL layout. A real persistent connection solution would require real session management, especially the ability to reset configuration options made during the previous session. > (connected as superuser) > > set session authorization 'userA'; > set session authorization 'userB'; > > IMHO it must be disable, right must be something like: > > set session authorization 'userA'; > unset session authorization; <-- switch back to superuser > set session authorization 'userB'; You can't "unset" the session user, there must always be one because there is nothing below it. > ..like as on Linux: > > # su - zakkr > $ id -u > 1000 > $ su - jmarek > Password: > su: Authentication failure > Sorry. The difference here is that 'su' also starts a new session but set session authorization changes the state of the current session. So 'su' is similar to START SESSION; -- Don't know if this is the syntax. SET SESSION AUTHORIZATION 'xxx'; all in one command. When and if we get real session management we will probably have the ability to revert user identity changes like you probably imagine. -- Peter Eisentraut [EMAIL PROTECTED] http://funkturm.homeip.net/~peter ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] refusing connections based on load ...
Nathan Myers wrote: > On Mon, Apr 23, 2001 at 03:09:53PM -0300, The Hermit Hacker wrote: > > > > Anyone thought of implementing this, similar to how sendmail does it? If > > load > n, refuse connections? > > ... > > If nobody is working on something like this, does anyone but me feel that > > it has merit to make use of? I'll play with it if so ... > > I agree that it would be useful. Even more useful would be soft load > shedding, where once some load average level is exceeded the postmaster > delays a bit (proportionately) before accepting a connection. Or have the load check on AtXactStart, and delay new transactions until load is back below x, where x is configurable per user/group plus some per database scaling factor. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # _ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] RI oddness
Hi, I just got trapped by one of my own features in the referential integrity area. The problem is, that the trigger run on the FK row at UPDATE allways checks and locks the referenced PK, even if the FK attributes didn't change. That's because if there'd be an ON DELETE SET DEFAULTS and someone deletes a PK consisting of all the FK's column defaults, we wouldn't notice and let it pass through. The bad thing on it is now, if I have one XACT that locks the PK row first, then locks the FK row, and I have another XACT that just want's to update another field in the FK row, that second XACT must lock the PK row in the first place or this entire thing leads to deadlocks. If one table has alot of FK constraints, this causes not really wanted lock contention. The clean way to get out of it would be to skip non-FK-change events in the UPDATE trigger and do alot of extra work in the SET DEFAULTS trigger. Actually it'd be to check if we're actually deleting the FK defaults values from the PK table, and if so we'd have to check if references exist by doing another NO ACTION kinda test. Any other smart idea? Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # _ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] refusing connections based on load ...
On Mon, Apr 23, 2001 at 03:09:53PM -0300, The Hermit Hacker wrote: > > Anyone thought of implementing this, similar to how sendmail does it? If > load > n, refuse connections? > ... > If nobody is working on something like this, does anyone but me feel that > it has merit to make use of? I'll play with it if so ... I agree that it would be useful. Even more useful would be soft load shedding, where once some load average level is exceeded the postmaster delays a bit (proportionately) before accepting a connection. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] refusing connections based on load ...
Anyone thought of implementing this, similar to how sendmail does it? If load > n, refuse connections? Basically, if great to set max clients to 256, but if load hits 50 as a result, the database is near to useless ... if you set it to 256, and 254 idle connections are going, load won't rise much, so is safe, but if half of those processes are active, it hurts ... so, if it was set so that a .conf variable could be set so that max connection == 256 *or* load > n to refuse connections, you'd hvae best of both worlds ... sendmail does it now, and, apparently relatively portable across OSs ... okay, just looked at the code, and its kinda painful, but its in src/conf.c, as a 'getla' function ... If nobody is working on something like this, does anyone but me feel that it has merit to make use of? I'll play with it if so ... ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] RFC: planner statistics in 7.2
Bruce Momjian wrote: > > A different approach that's been discussed on pghackers is to make use > > of btree indexes for columns that have such indexes: we could scan the > > indexes to visit all the column values in sorted order. I have rejected > > that approach because (a) it doesn't help for columns without a suitable > > index; (b) our indexes don't distinguish deleted and live tuples, > > which would skew the statistics --- in particular, we couldn't tell a > > frequently-updated single tuple from a commonly repeated value; (c) > > scanning multiple indexes would likely require more total I/O than just > > grabbing sample tuples from the main table --- especially if we have to > > do that anyway to handle columns without indexes. > > Remember one idea is for index scans to automatically update the expired > flag in the index bitfields when they check the heap tuple. And we should really do that. While playing around with my (for 7.2 to be) access statistics stuff I found that when running pg_bench, a couple of thousand index scans cause millions and millions of buffer fetches, because that pg_bench updates one and the same row over and over again and it has a PKEY. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # _ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] row name length
Just a question, where is NAMEDATALEN now in 7.1, I didn't see it in postgres.h. If this is no longer used to change column name length, what is? Your help is appreciated. - Adam Rose ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Look what book I found
Postgresql Programmer's Guide by Thomas Lockhart, Thomas Lochart (Editor) http://www.amazon.com/exec/obidos/ASIN/0595149170/ref=pd_sim_elt_l1/107-6921356-0996510 -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] book I found
Actually, the text I quoted was wrong. It was from the Amazon web page. The book cover says: Postgresql Programmer's Guide by The PostgreSQL Development Team Edited by Thomas Lochart Also, somone reviewed my book at: http://Linuxiso.org/bookreviews/postgresql.html This is how I found about about this other book. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] concurrent Postgres on NUMA - howto ?
Folks: I'm planning a port of Postgres to a multiprocessor architecture in which all nodes have both local memory and fast access to a shared memory. Shared memory it more expensive than local memory. My intent is to put the shmem & lock structures in shared memory, but use a copy-in / copy-out approach to maintain coherence in the buffer cache: - copy buffer from shared memroy on buffer allocate - write back buffer to shared memory when it is dirtied. Is that enough ? The idea sketch is as follows (mostly, changes contained to storage/buffer/bufmgr.c): -change BufferAlloc, etc, to create a node-local copy of the buffer (from shared memory). Copy both the BufferDesc entry and the buffer->data array -change WriteBuffer to copy the (locally changed) buffer to shared memory (this is the point in which the BM_DIRTY bit is set). [ I am assuming the buffer is locked & this is a safe time to make the buffer visible to other backends]. [Assume, for this discussion, that the sem / locks structs in shared memory have been ported & work ]. Ditto for the hash access. My concern is whether that is enough to maintain consistency in the buffer cache (i.e, are there other places in the code where a backend might have a leftover pointer to somewhere in the buffer cache ? ) Because, in the scheme above, the buffer cache is not directly accessible to the backend except via this copy in / copy -out approach. [BTW, I think this might be a way of providing a 'cluster' version of Postgers, by using some global communication module to obtain/post the 'buffer cache' values] thanks regards Mauricio [EMAIL PROTECTED] _ Get your FREE download of MSN Explorer at http://explorer.msn.com ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] row name length
Thanks for you help On Mon, 23 Apr 2001, Peter Eisentraut wrote: > Adam Rose writes: > > > I noticed in the documentation that row length is unlimited. I think I > > took that to mean row name length is now unlimited. But, row > > You took that wrong... > > > name length still appears to be set to a static width. Do I still need to > > recompile postgres to get 64 character row headers? > > Yes. > > -- Adam Rose ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] RFC: planner statistics in 7.2
At 10:10 23/04/01 -0400, Tom Lane wrote: >>> All that we're discussing here is one specific parameter in the cost >>> estimation for an indexscan, viz, the extent to which the table ordering >>> agrees with the index ordering. > >> This does not necessarily follow. A table ordering need not follow the sort >> order of an index for the index to have a low indexscan cost. All that is >> required is that most of the rows referred to by an index node must reside >> in a page or pages that will be read by one IO. eg. a table that has a >> sequence based ID, with, say 20% of rows updated, will work nicely with an >> indexscan on the ID, even though it has never been clustered. > >Right, what matters is the extent of correlation between table ordering >and index ordering, not how it got to be that way. No; *local* ordering needs to *roughly* match. Global ordering and record-by-record ordering don't matter. For example, for a table with an ID field, the rows may be stored as (where --- indicates a mythical page) - 5 9 6 7 - 1 10 2 3 - 4 8 11 12 - A sorted index may have nodes pointers to (1,2,3), (4,5,6), (7,8,9) and (10,11,12). The first node would take 1 IO, then each of the others would take 2. This would give a much more reasonable estimate for the indexscan costs (assuming a random sample is adequate). >> What I'm suggesting is that if you look at a random sample of index nodes, >> you should be able to get a statistically valid estimate of the 'clumping' >> of the data pointed to by the index. > >And I'm saying that you don't actually have to look at the index in >order to compute the very same estimate. No. Not given the above. >The only property of the index >that matters is its sort order; if you assume you know the right sort >order (and in practice there's usually only one interesting possibility >for a column) then you can compute the correlation just by looking at >the table. This is true, but only if you are strictly interested in sort order, which I don't think we are. >Andreas correctly points out that this approach doesn't extend very well >to multi-column or functional indexes, but I'm willing to punt on those >for the time being ... My approach should work with arbitrary indexes. Philip Warner| __---_ Albatross Consulting Pty. Ltd. |/ - \ (A.B.N. 75 008 659 498) | /(@) __---_ Tel: (+61) 0500 83 82 81 | _ \ Fax: (+61) 0500 83 82 82 | ___ | Http://www.rhyme.com.au |/ \| |---- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] RFC: planner statistics in 7.2
Tom Lane wrote: > Philip Warner <[EMAIL PROTECTED]> writes: > >> What I'm suggesting is that if you look at a random sample of index nodes, >> you should be able to get a statistically valid estimate of the 'clumping' >> of the data pointed to by the index. > > > And I'm saying that you don't actually have to look at the index in > order to compute the very same estimate. The only property of the index > that matters is its sort order; if you assume you know the right sort > order (and in practice there's usually only one interesting possibility > for a column) then you can compute the correlation just by looking at > the table. This is more true for unique indexes than for non-unique ones unless our non-unique indexes are smart enough to insert equal index nodes in table order . > Andreas correctly points out that this approach doesn't extend very well > to multi-column or functional indexes, but I'm willing to punt on those > for the time being ... -- Hannu ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] row name length
Adam Rose writes: > I noticed in the documentation that row length is unlimited. I think I > took that to mean row name length is now unlimited. But, row You took that wrong... > name length still appears to be set to a static width. Do I still need to > recompile postgres to get 64 character row headers? Yes. -- Peter Eisentraut [EMAIL PROTECTED] http://funkturm.homeip.net/~peter ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Re: How to determine if a user exists..
Dominic J. Eidson writes: > On Mon, 23 Apr 2001, Joel Burton wrote: > > > pg_user holds users > > > > (passwords in pg_shadow) > > I doubt the -hackers people would let me add SPI_* stuff into libpq, just > to retrieve whether a user exists or not.. You wouldn't have to do that. There are better ways to read system tables in the backend. See FAQ_DEV. > My first thought was to check > the existance of users against $PGDATA/pg_pwd... One question I'd have > there, is whether pg_pwd always exists (or, can be relied upon existing.)? No it doesn't and no you can't. The best way to verify a user's existence in the context of a new authentication method is to not do that at all. None of the other methods do it, the existence of a user is checked when authentication has completed and the backend starts. -- Peter Eisentraut [EMAIL PROTECTED] http://funkturm.homeip.net/~peter ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] RFC: planner statistics in 7.2
Philip Warner <[EMAIL PROTECTED]> writes: >> All that we're discussing here is one specific parameter in the cost >> estimation for an indexscan, viz, the extent to which the table ordering >> agrees with the index ordering. > This does not necessarily follow. A table ordering need not follow the sort > order of an index for the index to have a low indexscan cost. All that is > required is that most of the rows referred to by an index node must reside > in a page or pages that will be read by one IO. eg. a table that has a > sequence based ID, with, say 20% of rows updated, will work nicely with an > indexscan on the ID, even though it has never been clustered. Right, what matters is the extent of correlation between table ordering and index ordering, not how it got to be that way. > What I'm suggesting is that if you look at a random sample of index nodes, > you should be able to get a statistically valid estimate of the 'clumping' > of the data pointed to by the index. And I'm saying that you don't actually have to look at the index in order to compute the very same estimate. The only property of the index that matters is its sort order; if you assume you know the right sort order (and in practice there's usually only one interesting possibility for a column) then you can compute the correlation just by looking at the table. Andreas correctly points out that this approach doesn't extend very well to multi-column or functional indexes, but I'm willing to punt on those for the time being ... regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
[HACKERS] row name length
I noticed in the documentation that row length is unlimited. I think I took that to mean row name length is now unlimited. But, row name length still appears to be set to a static width. Do I still need to recompile postgres to get 64 character row headers? Postgres 7.1 RPMS Redhat 6.2 Help is always appreciated -- Adam Rose ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] How to determine if a user exists..
"Dominic J. Eidson" <[EMAIL PROTECTED]> writes: > I am trying to add another authentication mechanism to PostgreSQL... And, > in doing that, I need to verify the existance of an user within PG. Short > of hacking together code from verify_password(), is there any way to check > if a user exists in postgresql? If you're trying to do this from the postmaster, I think the only way is to look at $PGDATA/global/pg_pwd, which is a flat-file version of pg_shadow. You'd be well advised to study the existing verification mechanisms in src/backend/libpq/. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] How to determine if a user exists..
On Mon, 23 Apr 2001, Tom Lane wrote: > If you're trying to do this from the postmaster, I think the only way is > to look at $PGDATA/global/pg_pwd, which is a flat-file version of > pg_shadow. This is what I thought - thanks. -- Dominic J. Eidson "Baruk Khazad! Khazad ai-menu!" - Gimli --- http://www.the-infinite.org/ http://www.the-infinite.org/~dominic/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Re: Replication through WAL
> Bruce Momjian wrote: > Okay, would it be helpful if I made a few suggestions on things that I > as a user/tool developer of postgres might find helpful? Not sure. I recommend hanging around, and when the discussion starts, you can add things. > > > > > > Is there an easy way to read the WAL files generated by Postgres? I'm > > > looking into writting a replication deamon for postgres and think that > > > the WAL files are the best way to know what has happened to the db and > > > what has to be replicated. I have a roughed out idea of how to code it > > > up but the only thing I'm really wrestling with is how read the WAL > > > files. Can anyone give me some pointers. I've looked at all the xlog* > > > code and have a basic understading of how it works, but not a real good > > > one. Any help on this would be appreciated. Thanks > > > > Many believe WAL will be the basis for further replication features. > > We will discuss this as part of 7.2 development in a few weeks. > > > > -- > > Bruce Momjian| http://candle.pha.pa.us > > [EMAIL PROTECTED] | (610) 853-3000 > > + If your life is a hard drive, | 830 Blythe Avenue > > + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 > > > > ---(end of broadcast)--- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/users-lounge/docs/faq.html > > ---(end of broadcast)--- > TIP 4: Don't 'kill -9' the postmaster > -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Will something bad happen if I put index on xmin ?
Hannu Krosing <[EMAIL PROTECTED]> writes: > 1. Will something bad happen if I put index on xmin ? I was just testing that sort of thing yesterday. pg_dump prior to yesterday's patch will crash upon seeing such an index, but that was the only problem I found. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] pg_statistic
Tatsuo Ishii <[EMAIL PROTECTED]> writes: > I have a question about pg_statistic: Can we safely remove all records > from pg_statistic? Sure. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] Re: How to determine if a user exists..
On Mon, 23 Apr 2001, Joel Burton wrote: > pg_user holds users > > (passwords in pg_shadow) I doubt the -hackers people would let me add SPI_* stuff into libpq, just to retrieve whether a user exists or not.. My first thought was to check the existance of users against $PGDATA/pg_pwd... One question I'd have there, is whether pg_pwd always exists (or, can be relied upon existing.)? -- Dominic J. Eidson "Baruk Khazad! Khazad ai-menu!" - Gimli --- http://www.the-infinite.org/ http://www.the-infinite.org/~dominic/ ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
[HACKERS] 7.2 feature request
What I'd like to see in 7.2 is a WAL API with the following functionality: * Get the latest transaction in the WAL * Get transaction, transId, from the WAL * Was a given transaction rolled back? What I don't want to have to worry about is all the internals needed for writting the log. I shouldn't need to know if a new seagment was created, when and where checkpoints are made, etc. Perhaps there could be a small library for just reading stuff the WAL file? The second suggestion I have is one I alreay alluded to and would offer to help work on. Namely replication for Postgres. Given the simple API listed above I already have a roughed out design for replication deamon. Just some suggestions. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Re: How to determine if a user exists..
On Mon, 23 Apr 2001, Dominic J. Eidson wrote: > I am trying to add another authentication mechanism to PostgreSQL... And, > in doing that, I need to verify the existance of an user within PG. Short > of hacking together code from verify_password(), is there any way to check > if a user exists in postgresql? (The actuall password verification will be > taken care of elsewhere... I just need to check if the user exists.) pg_user holds users (passwords in pg_shadow) HTH, -- Joel Burton <[EMAIL PROTECTED]> Director of Information Systems, Support Center of Washington ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Will something bad happen if I put index on xmin ?
Hi, I'm about to write a simple one-way replication script relying on xmin and would like to speed up things by putting an index on it. So I have a few questions: 1. Will something bad happen if I put index on xmin ? 2. Is it just a bad idea to do it that way ? (there will be no deletes, just mainly inserts and some updates) Hannu ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] RFC: planner statistics in 7.2
At 22:27 19/04/01 -0400, Tom Lane wrote: >Philip Warner <[EMAIL PROTECTED]> writes: >> At 21:14 19/04/01 -0400, Tom Lane wrote: >>> But you don't really need to look at the index (if it even exists >>> at the time you do the ANALYZE). The extent to which the data is >>> ordered in the table is a property of the table, not the index. > >> But the value (and cost) of using a specific index in an indexscan depends >> on that index (or am I missing something?). > >All that we're discussing here is one specific parameter in the cost >estimation for an indexscan, viz, the extent to which the table ordering >agrees with the index ordering. This does not necessarily follow. A table ordering need not follow the sort order of an index for the index to have a low indexscan cost. All that is required is that most of the rows referred to by an index node must reside in a page or pages that will be read by one IO. eg. a table that has a sequence based ID, with, say 20% of rows updated, will work nicely with an indexscan on the ID, even though it has never been clustered. What I'm suggesting is that if you look at a random sample of index nodes, you should be able to get a statistically valid estimate of the 'clumping' of the data pointed to by the index. Am I still missing the point? Philip Warner| __---_ Albatross Consulting Pty. Ltd. |/ - \ (A.B.N. 75 008 659 498) | /(@) __---_ Tel: (+61) 0500 83 82 81 | _ \ Fax: (+61) 0500 83 82 82 | ___ | Http://www.rhyme.com.au |/ \| |---- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] Re: Replication through WAL
Bruce Momjian wrote: Okay, would it be helpful if I made a few suggestions on things that I as a user/tool developer of postgres might find helpful? > > > Is there an easy way to read the WAL files generated by Postgres? I'm > > looking into writting a replication deamon for postgres and think that > > the WAL files are the best way to know what has happened to the db and > > what has to be replicated. I have a roughed out idea of how to code it > > up but the only thing I'm really wrestling with is how read the WAL > > files. Can anyone give me some pointers. I've looked at all the xlog* > > code and have a basic understading of how it works, but not a real good > > one. Any help on this would be appreciated. Thanks > > Many believe WAL will be the basis for further replication features. > We will discuss this as part of 7.2 development in a few weeks. > > -- > Bruce Momjian| http://candle.pha.pa.us > [EMAIL PROTECTED] | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 > > ---(end of broadcast)--- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
AW: AW: [HACKERS] RFC: planner statistics in 7.2
> >> But you don't really need to look at the index (if it even exists > >> at the time you do the ANALYZE). The extent to which the data is > >> ordered in the table is a property of the table, not the index. > > > Think compound, ascending, descending and functional index. > > The (let's call it) cluster statistic for estimating indexscan cost can only > > be deduced from the index itself (for all but the simplest one column btree). > > If you want to write code that handles those cases, go right ahead ;-). > I think it's sufficient to look at the first column of a multicolumn > index for cluster-order estimation I often see first index columns that are even unique when the appl is installed for a small company (like a company id column (e.g. "mandt" in SAP)). > --- remember all these numbers are pretty crude anyway. Ok, you want to supply a value, that shows how well sorted single columns are in regard to < >. Imho this value should be stored in pg_attribute. Later someone can add a statistic to pg_index that shows how well clustered the index is. In lack of a pg_index statistic the optimizer uses the pg_attribute value of the first index column. I think that would be a good plan. > We have no such thing as a "descending index"; > and I'm not going to worry about clustering estimation for functional > indexes. Ok, an approach that reads ctid pointers from the index in index order would not need to worry about how the index is actually filled. It would need a method to sample (or read all) ctid pointers from the index in index order. Andreas ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: SET SESSION AUTHORIZATION (was Re: [HACKERS] Real/effective user)
On Sat, Apr 21, 2001 at 05:43:02PM +0200, Peter Eisentraut wrote: > I have implemented this; it seems to do what we need: > > $ ~/pg-install/bin/psql -U peter > > peter=# set session authorization 'joeblow'; > SET VARIABLE > peter=# create table foo (a int); > CREATE > peter=# \dt > List of relations > Name | Type | Owner > ---+---+- > foo | table | joeblow > test | table | peter > test2 | table | peter > (3 rows) Great! With this feature is possible use persisten connection and on-the-fly changing actual user, right? It's very useful for example web application that checking user privilege via SQL layout. I have I question, what happen with this code: (connected as superuser) set session authorization 'userA'; set session authorization 'userB'; IMHO it must be disable, right must be something like: set session authorization 'userA'; unset session authorization; <-- switch back to superuser set session authorization 'userB'; ..like as on Linux: # su - zakkr $ id -u 1000 $ su - jmarek Password: su: Authentication failure Sorry. Karel -- Karel Zak <[EMAIL PROTECTED]> http://home.zf.jcu.cz/~zakkr/ C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]