Re: [HACKERS] OID wraparound: summary and proposal
On Thu, Aug 02, 2001 at 09:28:18AM +0200, Zeugswetter Andreas SB wrote: > > > Strangely enough, I've seen no objection to optional OIDs > > other than mine. Probably it was my mistake to have formulated > > a plan on the flimsy assumption. > > I for one am more concerned about adding additional per > tuple overhead (moving from 32 -> 64bit) than loosing OID's > on some large tables. Imho optional OID's is the best way to combine > both worlds. At the same time that we announce support for optional OIDs, we should announce that, in future releases, OIDs will only be guaranteed unique (modulo wraparounds) within a single table. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Bad timestamp external representation
On Thu, Jul 26, 2001 at 05:38:23PM -0400, Bruce Momjian wrote: > Nathan Myers wrote: > > Bruce wrote: > > > > > > I can confirm that current CVS sources have the same bug. > > > > > > > It's a bug in timestamp output. > > > > > > > > # select '2001-07-24 15:55:59.999'::timestamp; > > > > ?column? > > > > --- > > > > 2001-07-24 15:55:60.00-04 > > > > (1 row) > > > > > > > > Richard Huxton wrote: > > > > > > > > > > From: "tamsin" <[EMAIL PROTECTED]> > > > > > > > > > > > Hi, > > > > > > > > > > > > Just created a db from a pg_dump file and got this error: > > > > > > > > > > > > ERROR: copy: line 602, Bad timestamp external representation > > > > > > '2000-10-03 09:01:60.00+00' > > > > > > > > > > > > I guess its a bad representation because 09:01:60.00+00 > > > > > > is actually 09:02, but how could it have got into my > > > > > > database/can I do anything about it? The value must have > > > > > > been inserted by my app via JDBC, I can't insert that value > > > > > > directly via psql. > > > > > > > > > > Seem to remember a bug in either pg_dump or timestamp > > > > > rendering causing rounding-up problems like this. If no-one > > > > > else comes up with a definitive answer, check the list > > > > > archives. If you're not running the latest release, check the > > > > > change-log. > > > > It is not a bug, in general, to generate or accept times like > > 09:01:60. Leap seconds are inserted as the 60th second of a minute. > > ANSI C defines the range of struct member tm.tm_sec as "seconds > > after the minute [0-61]", inclusive, and strftime format %S as "the > > second as a decimal number (00-61)". A footnote mentions "the range > > [0-61] for tm_sec allows for as many as two leap seconds". > > > > This is not to say that pg_dump should misrepresent stored times, > > but rather that PG should not reject those misrepresented times as > > being ill-formed. 
> > We were lucky that PG has the bug which causes it
> > to reject these times, as it led to the other bug in pg_dump being
> > noticed.
>
> We should accept :60 seconds but we should round 59.99 to 1:00, right?

If the xx:59.999 occurred immediately before a leap second, rounding it up to (xx+1):00.00 would introduce an error of 1.001 seconds.

As I understand it, the problem is in trying to round 59.999 to two digits. My question is, why is pg_dump representing times with less precision than PostgreSQL's internal format? Should pg_dump be lossy?

Nathan Myers [EMAIL PROTECTED]
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
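The rounding effect discussed in this message can be reproduced outside the database. The following is a minimal sketch in plain C, not PostgreSQL's timestamp output code; the function names are hypothetical. It shows that printing the seconds field with two fractional digits turns 59.999 into "60.00", while three digits preserve the stored value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a seconds field with two fractional digits.
 * printf-style formatting rounds, so 59.999 is written as "60.00". */
void format_seconds_2(double sec, char *buf, size_t n)
{
    assert(buf != NULL && n >= 6);      /* room for "SS.ss" plus NUL */
    snprintf(buf, n, "%05.2f", sec);
}

/* Same field with three fractional digits: 59.999 is written as "59.999". */
void format_seconds_3(double sec, char *buf, size_t n)
{
    assert(buf != NULL && n >= 7);      /* room for "SS.sss" plus NUL */
    snprintf(buf, n, "%06.3f", sec);
}

/* Returns 1 if the two-digit rendering shows a 60th second. */
int renders_as_sixty(double sec)
{
    char buf[16];
    format_seconds_2(sec, buf, sizeof buf);
    return strncmp(buf, "60", 2) == 0;
}
```

Carrying the overflow into the minutes field would hide the symptom, but as noted above that is only correct when the instant does not precede a leap second; writing enough digits avoids the question.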
Re: [HACKERS] Bad timestamp external representation
On Wed, Jul 25, 2001 at 06:53:21PM -0400, Bruce Momjian wrote: > > I can confirm that current CVS sources have the same bug. > > > It's a bug in timestamp output. > > > > # select '2001-07-24 15:55:59.999'::timestamp; > > ?column? > > --- > > 2001-07-24 15:55:60.00-04 > > (1 row) > > > > Richard Huxton wrote: > > > > > > From: "tamsin" <[EMAIL PROTECTED]> > > > > > > > Hi, > > > > > > > > Just created a db from a pg_dump file and got this error: > > > > > > > > ERROR: copy: line 602, Bad timestamp external representation '2000-10-03 > > > > 09:01:60.00+00' > > > > > > > > I guess its a bad representation because 09:01:60.00+00 is actually 09:02, > > > > but how could it have got into my database/can I do anything about it? > > > The > > > > value must have been inserted by my app via JDBC, I can't insert that > > > value > > > > directly via psql. > > > > > > Seem to remember a bug in either pg_dump or timestamp rendering causing > > > rounding-up problems like this. If no-one else comes up with a definitive > > > answer, check the list archives. If you're not running the latest release, > > > check the change-log. It is not a bug, in general, to generate or accept times like 09:01:60. Leap seconds are inserted as the 60th second of a minute. ANSI C defines the range of struct member tm.tm_sec as "seconds after the minute [0-61]", inclusive, and strftime format %S as "the second as a decimal number (00-61)". A footnote mentions "the range [0-61] for tm_sec allows for as many as two leap seconds". This is not to say that pg_dump should misrepresent stored times, but rather that PG should not reject those misrepresented times as being ill-formed. We were lucky that PG has the bug which causes it to reject these times, as it led to the other bug in pg_dump being noticed. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Re: RPM source files should be in CVS (was Re: [GENERAL] psql -l)
On Fri, Jul 20, 2001 at 07:05:46PM -0400, Trond Eivind Glomsrød wrote:
> Tom Lane <[EMAIL PROTECTED]> writes:
>
> > BTW, the only python shebangs I can find in CVS look like
> > #! /usr/bin/env python
> > Isn't that OK on RedHat?
>
> It is.

Probably the perl scripts should say, likewise,

#!/usr/bin/env perl

Nathan Myers [EMAIL PROTECTED]
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] MySQL Gemini code
On Wed, Jul 18, 2001 at 06:37:48PM -0400, Trond Eivind Glomsrød wrote:
> Michael Widenius <[EMAIL PROTECTED]> writes:
> > Assigning over the code is also something that FSF requires for all
> > code contributions. If you criticize us at MySQL AB, you should
> > also criticize the above.
>
> This is slightly different - FSF wants it so it will have a legal
> position to defend its programs:
...
> MySQL and TrollTech requires copyright assignment in order to sell
> non-open licenses. Some people will have a problem with this, while
> not having a problem with the FSF copyright assignment.

Nobody who works on MySQL is unaware of MySQL AB's business model. Anybody who contributes to the core server has to expect that MySQL AB will need to relicense anything accepted into the core; that's their right as originators. Everybody who contributes has a choice to make: fork, or sign over. (With the GPL, forking remains possible; Apple and Sun "community" licenses don't allow it.)

Anybody who contributes to PG has to make the same choice: fork, or put your code under the PG license. The latter choice is equivalent to "signing over" to all proprietary vendors, who are then free to take your code proprietary. Some of us like that.

> > I had actually hoped to get support from you guys at PostgreSQL
> > regarding this. You may have similar experience or at least
> > understand our position. The RedHat database may be a good thing
> > for PostgreSQL, but I am not sure if it's a good thing for RedHat
> > or for the main developers to PostgreSQL.
>
> This isn't even a remotely similar situation:
...

It's similar enough. One difference is that PG users are less afraid to fork. Another is that without the GPL, we have elected not to (and indeed cannot) stop any company from doing with PG what NuSphere is doing with MySQL.

This is why characterizing the various licenses as more or less "business-friendly" is misleading (i.e. dishonest) -- it evades the question, "friendly to whom?".
Businesses sometimes compete... Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] MySQL Gemini code
On Wed, Jul 18, 2001 at 08:35:58AM -0400, Jan Wieck wrote: > And this press release > > http://www.nusphere.com/releases/071601.htm > > also explains why they had to do it this way. They were always free to fork, but doing it the way they did -- violating MySQL AB's license -- they shot the dog. The lesson? Ask somebody competent, first, before you bet your company playing license games. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] MySQL Gemini code
On Wed, Jul 18, 2001 at 11:45:54AM -0400, Bruce Momjian wrote: > > And this press release > > > > http://www.nusphere.com/releases/071601.htm > ... > On a more significant note, I hear the word "fork" clearly suggested > in that text. It is almost like MySQL AB GPL'ed the MySQL code and > now they may not be able to keep control of it. Anybody is free to fork MySQL or PostgreSQL alike. The only difference is that all published MySQL forks must remain public, where PostgreSQL forks need not. MySQL AB is demonstrating their legal right to keep as much control as they chose, and NuSphere will lose if it goes to court. The interesting event here is that since NuSphere violated the license terms, they no longer have any rights to use or distribute the MySQL AB code, and won't until they get forgiveness from MySQL AB. MySQL AB would be within their rights to demand that the copyright to Gemini be signed over, before offering forgiveness. If Red Hat forks PostgreSQL, nobody will have any grounds for complaint. (It's been forked lots of times already, less visibly.) Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
[HACKERS] dependent dependants
For the record:

http://www.lineone.net/dictionaryof/englishusage/d0081889.html

dependent or dependant

"Dependent is the adjective, used for a person or thing that depends on someone or something: Admission to college is dependent on A-level results. Dependant is the noun, and is a person who relies on someone for financial support: Do you have any dependants?"

This is not for mailing-list pedantry, but just to make sure that the right spelling gets into the code.

(The page mentioned above was found by entering "dependent dependant" into Google.)

Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] Re: SOMAXCONN (was Re: Solaris source code)
On Thu, Jul 12, 2001 at 11:08:34PM +0200, Peter Eisentraut wrote: > Nathan Myers writes: > > > When the system is too heavily loaded (however measured), any further > > login attempts will fail. What I suggested is, instead of the > > postmaster accept()ing the connection, why not leave the connection > > attempt in the queue until we can afford a back end to handle it? > > Because the new connection might be a cancel request. Supporting cancel requests seems like a poor reason to ignore what load-shedding support operating systems provide. To support cancel requests, it would suffice for PG to listen at another socket dedicated to administrative requests. (It might even ignore MaxBackends for connections on that socket.) Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Re: SOMAXCONN (was Re: Solaris source code)
On Sat, Jul 14, 2001 at 11:38:51AM -0400, Tom Lane wrote:
>
> The state of affairs in current sources is that the listen queue
> parameter is MIN(MaxBackends * 2, PG_SOMAXCONN), where PG_SOMAXCONN
> is a constant defined in config.h --- it's 10000, hence a non-factor,
> by default, but could be reduced if you have a kernel that doesn't
> cope well with large listen-queue requests. We probably won't know
> if there are any such systems until we get some field experience with
> the new code, but we could have "configure" select a platform-dependent
> value if we find such problems.

Considering the Apache comment about some systems truncating instead of limiting... 10000&0xff is 16. Maybe 10239 would be a better choice, or 16383.

> So, having thought that through, I'm still of the opinion that holding
> off accept is of little or no benefit to us. But it's not as simple
> as it looks at first glance. Anyone have a different take on what the
> behavior is likely to be?

After doing some more reading, I find that most OSes do not reject connect requests that would exceed the specified backlog; instead, they ignore the connection request and assume the client will retry later. Therefore, it appears we cannot use a small backlog to shed load unless we assume that clients will time out quickly by themselves.

OTOH, maybe it's reasonable to assume that clients will time out, and that in the normal case authentication happens quickly. Then we can use a small listen() backlog, and never accept() if we have more than MaxBackends back ends. The OS will keep a small queue corresponding to our small backlog, and the clients will do our load shedding for us.

Nathan Myers [EMAIL PROTECTED]
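The arithmetic behind the truncation concern can be checked directly. This is a small sketch with hypothetical function names; the 8-bit truncating kernel is an assumption used only to illustrate the Apache comment, not a description of any particular OS.

```c
#include <assert.h>

/* A kernel that clamps an oversized backlog to its own maximum. */
int backlog_if_limited(int requested, int kernel_max)
{
    assert(requested >= 0 && kernel_max >= 0);
    return requested < kernel_max ? requested : kernel_max;
}

/* A hypothetical kernel that stores the backlog in 8 bits and
 * silently keeps only the low byte of the requested value. */
int backlog_if_truncated(int requested)
{
    assert(requested >= 0);
    return requested & 0xff;
}
```

Under that assumption, a request of 10000 leaves a queue of 16, while 10239 and 16383 both leave 255, which is why values with all low bits set are the safer choice.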
[HACKERS] Re: SOMAXCONN (was Re: Solaris source code)
On Fri, Jul 13, 2001 at 07:53:02AM -0400, mlw wrote:
> Zeugswetter Andreas SB wrote:
> > I liked the idea of min(MaxBackends, PG_SOMAXCONN), since there is no use in
> > accepting more than your total allowed connections concurrently.
>
> I have been following this thread and I am confused why the queue
> argument to listen() has anything to do with Max backends. All the
> parameter to listen does is specify how long a list of sockets open
> and waiting for connection can be. It has nothing to do with the
> number of back end sockets which are open.

Correct.

> If you have a limit of 128 back end connections, and you have 127
> of them open, a listen with queue size of 128 will still allow 128
> sockets to wait for connection before turning others away.

Correct.

> It should be a parameter based on the time out of a socket connection
> vs the ability to answer connection requests within that period of
> time.

It's not really meaningful at all, at present.

> There are two ways to think about this. Either you make this parameter
> tunable to give a proper estimate of the usability of the system, i.e.
> tailor the listen queue parameter to reject sockets when some number
> of sockets are waiting, or you say no one should ever be denied,
> accept everyone and let them time out if we are not fast enough.
>
> This debate could go on, why not make it a parameter in the config
> file that defaults to some system variable, i.e. SOMAXCONN.

With postmaster's current behavior there is no benefit in setting the listen() argument to anything less than 1000. With a small change in postmaster behavior, a tunable system variable becomes useful. But using SOMAXCONN blindly is always wrong; that is often 5, which is demonstrably too small.

> BTW: on linux, the backlog queue parameter is silently truncated to
> 128 anyway.

The 128 limit is common, applied on BSD and Solaris as well. It will probably increase in future releases.
Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] Re: SOMAXCONN
On Fri, Jul 13, 2001 at 10:36:13AM +0200, Zeugswetter Andreas SB wrote: > > > When the system is too heavily loaded (however measured), any further > > login attempts will fail. What I suggested is, instead of the > > postmaster accept()ing the connection, why not leave the connection > > attempt in the queue until we can afford a back end to handle it? > > Because the clients would time out ? It takes a long time for half-open connections to time out, by default. Probably most clients would time out, themselves, first, if PG took too long to get to them. That would be a Good Thing. Once the SOMAXCONN threshold is reached (which would only happen when the system is very heavily loaded, because when it's not then nothing stays in the queue for long), new connection attempts would fail immediately, another Good Thing. When the system is very heavily loaded, we don't want to spare attention for clients we can't serve. > > Then, the argument to listen() will determine how many attempts can > > be in the queue before the network stack itself rejects them without > > the postmaster involved. > > You cannot change the argument to listen() at runtime, or are you suggesting > to close and reopen the socket when maxbackends is reached ? I think > that would be nonsense. Of course that would not work, and indeed nobody suggested it. If postmaster behaved a little differently, not accept()ing when the system is too heavily loaded, then it would be reasonable to call listen() (once!) with PG_SOMAXCONN set to (e.g.) N=20. Where the system is not too heavily-loaded, the postmaster accept()s the connection attempts from the queue very quickly, and the number of half-open connections never builds up to N. (This is how PG has been running already, under light load -- except that on Solaris with Unix sockets N has been too small.) When the system *is* heavily loaded, the first N attempts would be queued, and then the OS would automatically reject the rest. 
This is better than accept()ing any number of attempts and then refusing to authenticate. The N half-open connections in the queue would be picked up by postmaster as existing back ends drop off, or time out and give up if that happens too slowly. > I liked the idea of min(MaxBackends, PG_SOMAXCONN), since there is no > use in accepting more than your total allowed connections concurrently. That might not have the effect you imagine, where many short-lived connections are being made. In some cases it would mean that clients are rejected that could have been served after a very short delay. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Re: SOMAXCONN (was Re: Solaris source code)
On Thu, Jul 12, 2001 at 10:14:44AM +0200, Zeugswetter Andreas SB wrote: > > > The question is really whether you ever want a client to get a > > "rejected" result from an open attempt, or whether you'd rather they > > got a report from the back end telling them they can't log in. The > > second is more polite but a lot more expensive. That expense might > > really matter if you have MaxBackends already running. > > One of us has probably misunderstood the listen parameter. I don't think so. > It only limits the number of clients that can connect concurrently. > It has nothing to do with the number of clients that are already > connected. It sort of resembles a maximum queue size for the accept > loop. Incoming connections fill the queue, accept frees the queue by > taking the connection to a newly forked backend. The MaxBackends constant and the listen() parameter have no effect until the number of clients already connected or trying to connect and not yet noticed by the postmaster (respectively) exceed some threshold. We would like to choose such thresholds so that we don't promise service we can't deliver. We can assume the administrator has tuned MaxBackends so that a system with that many back ends running really _is_ heavily loaded. (We have talked about providing a better measure of load than the gross number of back ends; is that on the Todo list?) When the system is too heavily loaded (however measured), any further login attempts will fail. What I suggested is, instead of the postmaster accept()ing the connection, why not leave the connection attempt in the queue until we can afford a back end to handle it? Then, the argument to listen() will determine how many attempts can be in the queue before the network stack itself rejects them without the postmaster involved. As it is, the listen() queue limit is not useful. It could be made useful with a slight change in postmaster behavior. 
Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] Re: SOMAXCONN (was Re: Solaris source code)
On Wed, Jul 11, 2001 at 12:26:43PM -0400, Tom Lane wrote: > Peter Eisentraut <[EMAIL PROTECTED]> writes: > > Tom Lane writes: > >> Right. Okay, it seems like just making it a hand-configurable entry > >> in config.h.in is good enough for now. When and if we find that > >> that's inadequate in a real-world situation, we can improve on it... > > > Would anything computed from the maximum number of allowed connections > > make sense? > > [ looks at code ... ] Hmm, MaxBackends is indeed set before we arrive > at the listen(), so it'd be possible to use MaxBackends to compute the > parameter. Offhand I would think that MaxBackends or at most > 2*MaxBackends would be a reasonable value. > > Question, though: is this better than having a hardwired constant? > The only case I can think of where it might not be is if some platform > out there throws an error from listen() when the parameter is too large > for it, rather than silently reducing the value to what it can handle. > A value set in config.h.in would be simpler to adapt for such a platform. The question is really whether you ever want a client to get a "rejected" result from an open attempt, or whether you'd rather they got a report from the back end telling them they can't log in. The second is more polite but a lot more expensive. That expense might really matter if you have MaxBackends already running. I doubt most clients have tested either failure case more thoroughly than the other (or at all), but the lower-level code is more likely to have been cut-and-pasted from well-tested code. :-) Maybe PG should avoid accept()ing connections once it has MaxBackends back ends already running (as hinted at by Ian), so that the listen() parameter actually has some meaningful effect, and excess connections can be rejected more cheaply. That might also make it easier to respond more adaptively to true load than we do now. 
> BTW, while I'm thinking about it: why doesn't pqcomm.c test for a > failure return from the listen() call? Is this just an oversight, > or is there a good reason to ignore errors? The failure of listen() seems impossible. In the Linux, NetBSD, and Solaris man pages, none of the error returns mentioned are possible with PG's current use of the function. It seems as if the most that might be needed now would be to add a comment to the call to socket() noting that if any other address families are supported (besides AF_INET and AF_LOCAL aka AF_UNIX), the call to listen() might need to be looked at. AF_INET6 (which PG will need to support someday) doesn't seem to change matters. Probably if listen() did fail, then one or other of bind(), accept(), and read() would fail too. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
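For reference, testing the return value of listen() costs very little even if failure is unlikely in practice. The following is a generic sketch, not the code in pqcomm.c; the wrapper name is hypothetical and the error is simply written to stderr.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical wrapper: call listen() and report any failure.
 * Returns 0 on success, -1 on error.
 */
int listen_checked(int fd, int backlog)
{
    if (listen(fd, backlog) < 0) {
        fprintf(stderr, "listen() failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

Passing an invalid descriptor, for example, makes the wrapper return -1 instead of letting the error surface later in accept().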
Re: [HACKERS] Re: Encrypting pg_shadow passwords
On Wed, Jul 11, 2001 at 01:24:53PM +1000, Michael Samuel wrote: > The crypt authentication currently used offers _no_ security. ... > Of course, SSL *if done correctly with certificate verification* is the > correct fix. If no certificate verification is done, you fall victim to > a man-in-the-middle attack. It seems worth noting here that you don't have to depend on SSL authentication; PG can do its own authentication over SSL and avoid the man-in-the-middle attack that way. Of course, PG would have to do its authentication properly, e.g. with the HMAC method. That seems better than depending on SSL authentication, because SSL certification seems to be universally misconfigured. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: SOMAXCONN (was Re: [HACKERS] Solaris source code)
On Tue, Jul 10, 2001 at 06:36:21PM -0400, Tom Lane wrote: > [EMAIL PROTECTED] (Nathan Myers) writes: > > All the OSes we know of fold it to 128, currently. We can jump it > > to 10240 now, or later when there are 20GHz CPUs. > > > If you want to make it more complicated, it would be more useful to > > be able to set the value lower for runtime environments where PG is > > competing for OS resources with another daemon that deserves higher > > priority. > > Hmm, good point. Does anyone have a feeling for the amount of kernel > resources that are actually sucked up by an accept-queue entry? If 128 > is the customary limit, is it actually worth worrying about whether > we are setting it to 128 vs. something smaller? I don't think the issue is the resources that are consumed by the accept-queue entry. Rather, it's a tuning knob to help shed load at the entry point to the system, before significant resources have been committed. An administrator would tune it according to actual system and traffic characteristics. It is easy enough for somebody to change, if they care, that it seems to me we have already devoted it more time than it deserves right now. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: SOMAXCONN (was Re: [HACKERS] Solaris source code)
On Tue, Jul 10, 2001 at 05:06:28PM -0400, Bruce Momjian wrote:
> > Mathijs Brands <[EMAIL PROTECTED]> writes:
> > > OK, I tried using 1024 (and later 128) instead of SOMAXCONN (defined to
> > > be 5 on Solaris) in src/backend/libpq/pqcomm.c and ran a few regression
> > > tests on two different Sparc boxes (Solaris 7 and 8). The regression
> > > test still fails, but for a different reason. The abstime test fails;
> > > not only on Solaris but also on FreeBSD (4.3-RELEASE).
> >
> > The abstime diff is to be expected (if you look closely, the test is
> > comparing 'current' to 'June 30, 2001'. Ooops). If that's the only
> > diff then you are in good shape.
> >
> >
> > Based on this and previous discussions, I am strongly tempted to remove
> > the use of SOMAXCONN and instead use, say,
> >
> > #define PG_SOMAXCONN 1000
> >
> > defined in config.h.in. That would leave room for configure to twiddle
> > it, if that proves necessary. Does anyone know of a platform where this
> > would cause problems? AFAICT, all versions of listen(2) are claimed to
> > be willing to reduce the passed parameter to whatever they can handle.
>
> Could we test SOMAXCONN and set PG_SOMAXCONN to 1000 only if SOMAXCONN
> is less than 1000?

All the OSes we know of fold it to 128, currently. We can jump it to 10240 now, or later when there are 20GHz CPUs.

If you want to make it more complicated, it would be more useful to be able to set the value lower for runtime environments where PG is competing for OS resources with another daemon that deserves higher priority.

Nathan Myers [EMAIL PROTECTED]
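The conditional definition suggested in the quoted question could look roughly like this. It is a sketch of the idea only, not what config.h.in contains; the 1000 floor is the figure from the discussion, and the `#ifndef` guard is an assumption that leaves room for configure or the builder to override the value.

```c
#include <sys/socket.h>

/* Allow a build-time override; otherwise use the system's SOMAXCONN
 * only when it is at least 1000, and fall back to 1000 when it is
 * smaller or not defined at all. */
#ifndef PG_SOMAXCONN
#if defined(SOMAXCONN) && SOMAXCONN >= 1000
#define PG_SOMAXCONN SOMAXCONN
#else
#define PG_SOMAXCONN 1000
#endif
#endif
```

Whatever value results, the kernel remains free to reduce it to what it can handle.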
Re: AW: [HACKERS] pg_index.indislossy
On Tue, Jul 10, 2001 at 01:36:33PM -0400, Tom Lane wrote: > Peter Eisentraut <[EMAIL PROTECTED]> writes: > > But why is this called lossy? Shouldn't it be called "exceedy"? > > Good point ;-). "lossy" does sound like the index might "lose" tuples, > which is exactly what it's not allowed to do; it must find all the > tuples that match the query. > > The terminology is correct by analogy to "lossy compression" --- the > index loses information, in the sense that its result isn't quite the > result you wanted. But I can see where it'd confuse the unwary. > Perhaps we should consult the literature and see if there is another > term for this concept. How about "hinty"? :-) Seriously, "indislossy" is a singularly poor name for a predicate. Also, are we so poor that we can't afford whole words, or even word breaks? I propose "index_is_hint". Actually, is the "ind[ex]" part even necessary? How about "must_check_heap"? Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Re: Backup and Recovery
On Fri, Jul 06, 2001 at 06:52:49AM -0400, Bruce Momjian wrote: > Nathan wrote: > > How hard would it be to turn these row records into updates against a > > pg_dump image, assuming access to a good table-image file? > > pg_dump is very hard because WAL contains only tids. No way to match > that to pg_dump-loaded rows. Maybe pg_dump can write out a mapping of TIDs to line numbers, and the back-end can create a map of inserted records' line numbers when the dump is reloaded, so that the original TIDs can be traced to the new TIDs. I guess this would require a new option on IMPORT. I suppose the mappings could be temporary tables. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Doing authentication in backend
On Thu, Jun 14, 2001 at 01:42:26PM -0400, Tom Lane wrote: > Also note that we could easily fix things so that the max-number-of- > backends limit is not checked until we have passed the authentication > procedure. A PM child that's still busy authenticating doesn't have > to count. And impose a very short timeout on authentication. > Another problem with the present setup is total cost of servicing each > connection request. We've seen several complaints about connection- > refused problems under heavy load, occurring because the single > postmaster process simply can't service the requests quickly enough to > keep its accept() queue from overflowing. This last could also be addressed (along with Solaris's Unix Sockets problem!) by changing the second argument to listen(2) from the current SOMAXCONN -- which is 5 in Solaris 2.7 -- to 127. See the six-page discussion in Stevens UNPv1 beginning at page 93. This is not to say we shouldn't fork before authentication, for the above and other reasons, but the fix to listen(2)'s argument should happen anyway. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] RE: Row Versioning, for jdbc updateable result sets
On Fri, Jun 15, 2001 at 10:21:37AM -0400, Tom Lane wrote:
> "Dave Cramer" <[EMAIL PROTECTED]> writes:
> > I had no idea that xmin even existed, but having a quick look I think this
> > is what I am looking for. Can I assume that if xmin has changed, then
> > another process has changed the underlying data ?
>
> xmin is a transaction ID, not a process ID, but looking at it should
> work for your purposes at present.
>
> There has been talk of redefining xmin as part of a solution to the
> XID-overflow problem: what would happen is that all "sufficiently old"
> tuples would get relabeled with the same special xmin, so that only
> recent transactions would need to have distinguishable xmin values.
> If that happens then your code would break, at least if you want to
> check for changes just at long intervals.

A simpler alternative was to change all "sufficiently old" tuples to have an xmin value, N, equal to the oldest that would need to be distinguished. xmin values could then be compared using normal arithmetic: less(xminA, xminB) is just ((xminA - N) < (xminB - N)), with no special cases.

> A hack that comes to mind is that when relabeling an old tuple this way,
> we could copy its original xmin into cmin while setting xmin to the
> permanently-valid XID. Then, if you compare both xmin and cmin, you
> have only about a 1 in 2^32 chance of being fooled. (At least if we
> use a wraparound style of allocating XIDs. I think Vadim is advocating
> resetting the XID counter to 0 at each system restart, so the active
> range of XIDs might be a lot smaller than 2^32 in that scenario.)

That assumes a pretty frequent system restart. Many of us prefer to code to the goal of a system that could run for decades.

Nathan Myers [EMAIL PROTECTED]
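The comparison described in this message can be written down in a few lines. This is a sketch assuming 32-bit transaction IDs held in unsigned integers; the function name is hypothetical and this is not PostgreSQL's own XID comparison code.

```c
#include <stdint.h>

/*
 * less(a, b) for transaction IDs, measured from n, the oldest XID that
 * still has to be distinguished.  Unsigned subtraction wraps modulo 2^32,
 * so the comparison stays correct after the XID counter wraps around.
 */
int xid_less(uint32_t a, uint32_t b, uint32_t n)
{
    return (uint32_t)(a - n) < (uint32_t)(b - n);
}
```

For example, with n = 0xFFFFFFF0, the XID 0xFFFFFFF5 is 5 steps past n and the XID 5 (allocated after wraparound) is 21 steps past n, so the first compares as older even though it is numerically larger.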
Re: [HACKERS] What (not) to do in signal handlers
On Thu, Jun 14, 2001 at 05:10:58PM -0400, Tom Lane wrote:
> Doug McNaught <[EMAIL PROTECTED]> writes:
> > Tom Lane <[EMAIL PROTECTED]> writes:
> >> Hm. That's one way, but is it really any cleaner than our existing
> >> technique? Since you still need to assume you can do a system call
> >> in a signal handler, it doesn't seem like a real gain in
> >> bulletproofness to me.
> >
> > Doing write() in a signal handler is safe; doing fprintf() (and
> > friends) is not.
>
> If we were calling the signal handlers from random places, then I'd
> agree. But we're not: we use sigblock to ensure that signals are only
> serviced at the place in the postmaster main loop where select() is
> called. So there's no actual risk of reentrant use of non-reentrant
> library functions.
>
> Please recall that in practice the postmaster is extremely reliable.
> The single bug we have seen with the signal handlers in recent releases
> was the problem that they were clobbering errno, which was easily fixed
> by saving/restoring errno. This same bug would have arisen (though at
> such low probability we'd likely never have solved it) in a signal
> handler that only invoked write(). So I find it difficult to buy the
> argument that there's any net gain in robustness to be had here.
>
> In short: this code isn't broken, and so I'm not convinced we should
> "fix" it.

Formally speaking, it *is* broken: we depend on semantics that are
documented as unportable and undefined. In a sense, we have been so
unlucky as not to have perceived, thus far, the undefined effects.
This is no different from depending on finding a NUL at *(char*)0, or
on being able to say "free(p); p = p->next;". Yes, it appears to work,
at the moment, on some platforms, but that doesn't make it correct.

It may not be terribly urgent to fix it right now, but that's far from
"isn't broken". It at least merits a TODO entry.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] What (not) to do in signal handlers
On Thu, Jun 14, 2001 at 04:27:14PM -0400, Tom Lane wrote:
> [EMAIL PROTECTED] (Nathan Myers) writes:
> > It could open a pipe, and write(2) a byte to it in the signal handler,
> > and then have select(2) watch that pipe. (SIGHUP could use the same pipe.)
> > Of course this is still a system call in a signal handler, but it can't
> > (modulo coding bugs) fail.
>
> Hm. That's one way, but is it really any cleaner than our existing
> technique? Since you still need to assume you can do a system call
> in a signal handler, it doesn't seem like a real gain in
> bulletproofness to me.

Quoting Stevens (UNPv2, p. 90),

  Posix uses the term *async-signal-safe* to describe the functions
  that may be called from a signal handler. Figure 5.10 lists these
  Posix functions, along with a few that were added by Unix98.
  Functions not listed may not be called from a signal handler. Note
  that none of the standard I/O functions ... are listed. Of all the
  IPC functions covered in this text, only sem_post, read, and write
  are listed (we are assuming the latter two would be used with pipes
  and FIFOs).

Restricting the handler to use those in the approved list seems like
an automatic improvement to me, even in the apparent absence of
evidence of problems on those platforms that happen to get tested most.

> > A pipe per backend might be considered pretty expensive.
>
> Pipe per postmaster, no? That doesn't seem like a huge cost.

I haven't looked at how complex the signal handling in the backends is;
maybe they don't need anything this fancy. (OTOH, maybe they should be
using a pipe to communicate with postmaster, instead of using signals.)

> I'd be
> more concerned about the two extra kernel calls (write and read) per
> signal received, actually.

Are there so many signals flying around? The signal handler would check
a flag before writing, so a storm of signals would result in only one
call to write, and one call to read, per select loop.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] What (not) to do in signal handlers
On Thu, Jun 14, 2001 at 02:18:40PM -0400, Tom Lane wrote:
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > I notice that the signal handlers in postmaster.c do quite a lot of work,
> > much more than what they teach you in school they should do.
>
> Yes, they're pretty ugly. However, we have not recently heard any
> complaints suggesting problems with it. Since we block signals
> everywhere except just around the select() for new input, there's not
> really any risk of recursive resource use AFAICS.
>
> > ISTM that most of these, esp. pmdie(), can be written more like the SIGHUP
> > handler, i.e., set a global variable and evaluate right after the
> > select().
>
> I would love to see it done that way, *if* you can show me a way to
> guarantee that the signal response will happen promptly. AFAIK there's
> no portable way to ensure that we don't end up sitting and waiting for a
> new client message before we get past the select().

It could open a pipe, and write(2) a byte to it in the signal handler,
and then have select(2) watch that pipe. (SIGHUP could use the same pipe.)

Writing to and reading from your own pipe can be a recipe for deadlock,
but here it would be safe if the signal handler knows not to get too
far ahead of select. (The easy way would be to allow no more than one
byte in the pipe per signal handler.)

Of course this is still a system call in a signal handler, but it can't
(modulo coding bugs) fail. See Stevens, "Unix Network Programming,
Vol. 2, Interprocess Communication", p. 91, Figure 5.10, "Functions
that are async-signal-safe". The figure lists write() among others.
Sample code implementing the above appears on page 94. Examples using
other techniques (sigwait, nonblocking mq_receive) are presented also.

A pipe per backend might be considered pretty expensive. Does UNIX
allocate a pipe buffer before there's anything to put in it?

Nathan Myers
[EMAIL PROTECTED]
[HACKERS] Re: Australian timezone configure option
On Thu, Jun 14, 2001 at 12:23:22AM +, Thomas Lockhart wrote:
> > Surely the correct solution is to have a config file somewhere
> > that gets read on startup? That way us Australians don't have to be the only
> > ones in the world that need a custom built postgres.
>
> I will point out that "you Australians", and, well, "us 'mericans", are
> the only countries without the sense to choose unique conventions for
> time zone names.
>
> It sounds like having a second lookup table for the Australian rules is
> a possibility, and this sounds fairly reasonable to me. Btw, is there an
> Australian convention for referring to North American time zones for
> those zones with naming conflicts?

For years I've been on the TZ list, the announcement list for a
community-maintained database of time zones. One point they have
firmly established is that there is no reasonable hope of making
anything like a standard system of time zone name abbreviations work.
Legislators and dictators compete for arbitrariness in their time
zone manipulations. Even if you assign, for your own use, an
abbreviation to a particular administrative region, you still need a
history of legislation for that region to know what any particular
time record (particularly an April or September one) really means.

The "best practice" for annotating times is to tag them with the
numeric offset from UTC at the time the sample is formed. If the time
sample is the present time, you don't have to know very much to make
or use it. If it's in the past, you have to know the legislative
history of the place to form a proper time record, but not to use it.
If the time is in the future, you cannot know what offset will be in
popular use at that time, but at least you can be precise about what
actual time you really mean, even if you can't be sure about what the
wall clock says. (Actual wall clock times are not reliably
predictable, a fact that occasionally makes things tough on airline
passengers.)

Things are a little more stable in some places (e.g. in Europe it is
improving) but worldwide all is chaos. Assigning some country's
current abbreviations at compile time is madness.

Nathan Myers
[EMAIL PROTECTED]
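As a small illustration of tagging a time with its numeric offset rather
than a zone abbreviation, here is a sketch. The function name
format_with_offset() and the output layout are assumptions chosen for the
example, not PostgreSQL's timestamp output code. The caller supplies the
offset that was in force, and no zone-name table is consulted.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Format a UTC instant as wall-clock time tagged with a numeric offset,
 * e.g. "1970-01-01 10:00:00+10:00". offset_minutes is the offset from
 * UTC in force when the sample was formed (east of Greenwich positive).
 * Returns 0 on success, -1 if conversion fails or the buffer is too small.
 */
static int format_with_offset(time_t utc, int offset_minutes,
                              char *buf, size_t buflen)
{
    time_t shifted = utc + (time_t) offset_minutes * 60;
    struct tm tm;
    int abs_off = abs(offset_minutes);
    int n;

    if (gmtime_r(&shifted, &tm) == NULL)
        return -1;

    n = snprintf(buf, buflen, "%04d-%02d-%02d %02d:%02d:%02d%c%02d:%02d",
                 tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
                 tm.tm_hour, tm.tm_min, tm.tm_sec,
                 offset_minutes < 0 ? '-' : '+',
                 abs_off / 60, abs_off % 60);
    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

A record written this way can be converted back to UTC by anyone,
without knowing which region's rules produced the offset.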
Re: [HACKERS] Idea: quicker abort after loss of client connection
On Tue, Jun 05, 2001 at 08:01:02PM -0400, Tom Lane wrote:
>
> Thoughts? Is there anything about this that might be unsafe? Should
> QueryCancel be set after *any* failure of recv() or send(), or only
> if certain errno codes are detected (and if so, which ones)?

Stevens identifies some errno codes that are not significant; in
particular, EINTR, EAGAIN, and EWOULDBLOCK. Of these, maybe only the
first occurs on a blocking socket.

Nathan Myers
[EMAIL PROTECTED]
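One way to encode the distinction above is a small predicate that the
recv()/send() error path consults before deciding to cancel. This is a
sketch only: socket_error_is_transient() is a hypothetical helper, not
something in the PostgreSQL sources, and which codes deserve to be on
the list is exactly the open question in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether an errno value left by a failed recv() or send() is
 * transient (try again) rather than a sign that the client is gone.
 * Hypothetical helper; the list of codes follows the message above.
 */
static bool socket_error_is_transient(int err)
{
    if (err == EINTR || err == EAGAIN)
        return true;
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
    /* On many systems EWOULDBLOCK is just another name for EAGAIN. */
    if (err == EWOULDBLOCK)
        return true;
#endif
    return false;
}
```

A caller would set the cancel flag only when recv() or send() fails and
socket_error_is_transient(errno) returns false.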
Re: [HACKERS] Re: Interesting Atricle
On Mon, Jun 04, 2001 at 04:55:13PM -0400, Bruce Momjian wrote:
> > This is getting off-topic, but ...
> >
> > I keep CSS, Javascript, Java, dynamic fonts, and images turned off, and
> > Netscape 4.77 stays up for many weeks at a time. I also have no Flash
> > plugin. All together it makes for a far more pleasant web experience.
> >
> > I didn't notice any problem with the Zend page.
>
> You are running no images! You may as well have Netscape minimized and
> say it is running for weeks. :-)

Over 98% of the images on the web are either pr0n or wankage. If you
don't need to see that, you can save a lot of time.

But it's usually Javascript that crashes Netscape. (CSS appears to be
implemented using Javascript, because if you turn off Javascript, then
CSS stops working (and crashing).) That's not to say that Java doesn't
also crash Netscape; it's just that pages with Java in them are not
very common.

There's little point in bookmarking a site that depends on client-side
Javascript or Java, because it won't be up for very long.

But this is *really* off topic, now.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: Interesting Atricle
On Sat, Jun 02, 2001 at 10:59:20AM -0400, Vince Vielhaber wrote:
> On Fri, 1 Jun 2001, Bruce Momjian wrote:
>
> > > > Thought some people might find this article interesting.
> > > > http://www.zend.com/zend/art/databases.php
> > >
> > > The only interesting thing I noticed is how fast it crashes my
> > > Netscape-4.76 browser ;)
> >
> > Yours too? I turned off Java/Javascript to get it to load and I am on
> > BSD/OS. Strange it so universally crashes.
>
> Really odd. I have Java/Javascript with FreeBSD and Netscape 4.76 and
> read it just fine. One difference tho probably, I keep style sheets
> shut off. Netscape crashes about 1% as often as it used to.

This is getting off-topic, but ...

I keep CSS, Javascript, Java, dynamic fonts, and images turned off, and
Netscape 4.77 stays up for many weeks at a time. I also have no Flash
plugin. All together it makes for a far more pleasant web experience.

I didn't notice any problem with the Zend page.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Imperfect solutions
On Thu, May 31, 2001 at 10:07:36AM -0400, Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > What got me thinking about this is that I don't think my gram.y fix
> > would be accepted given the current review process,
>
> Not to put too fine a point on it: the project has advanced a long way
> since you did that code. Our standards *should* be higher than they
> were then.
>
> > and that is bad
> > because we would have to live with no LIKE optimization for 1-2 years
> > until we learned how to do it right.
>
> We still haven't learned how to do it right, actually. I think the
> history of the LIKE indexing problem is a perfect example of why fixes
> that work for some people but not others don't survive long. We put out
> several attempts at making it work reliably in non-ASCII locales, but
> none of them have withstood the test of actual usage.
>
> > I think there are a few rules we can use to decide how to deal with
> > imperfect solutions:
>
> You forgot
>
> * will the fix institutionalize user-visible behavior that will in the
> long run be considered the wrong thing?
>
> * will the fix contort new code that is written in the same vicinity,
> thereby making it harder and harder to replace as time goes on?
>
> The first of these is the core of my concern about %TYPE.

This list points up a problem that needs a better solution than a
list: you have to put in questionable features now to get the usage
experience you need to do it right later. The set of prospective
features that meet that description does not resemble the set that
would pass all the criteria in the list.

This is really a familiar problem, with a familiar solution. When a
feature is added that is "wrong", make sure it's "marked" somehow --
at worst, in the documentation, but ideally with a NOTICE or something
when it's used -- as experimental. If anybody complains later that
when you ripped it out and redid it correctly, you broke his code, you
can just laugh, and add, if you're feeling charitable, "experimental
features are not to be depended on".

--
Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: charin(), text_char() should return something else for empty input
On Mon, May 28, 2001 at 02:37:32PM -0400, Tom Lane wrote:
> I wrote:
> > I propose that both of these operations should return a space character
> > for an empty input string. This is by analogy to space-padding as you'd
> > get with char(1). Any objections?
>
> An alternative approach is to make charin and text_char map empty
> strings to the null character (\0), and conversely make charout and
> char_text map the null character to empty strings. charout already
> acts that way, in effect, since it has to produce a null-terminated
> C string. This way would have the advantage that there would still
> be a reversible dump and reload representation for a "char" field
> containing '\0', whereas space-padding would cause such a field to
> become ' ' after reload. But it's a little strange if you think that
> "char" ought to behave the same as char(1).

Does the standard require any particular behavior with NUL
characters? I'd like to see PG move toward treating them as ordinary
control characters. I realize that at best it will take a long time
to get there. C is irretrievably mired in the "NUL is a terminator"
swamp, but SQL isn't C.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] BSD gettext
On Thu, May 24, 2001 at 10:30:01AM -0400, Bruce Momjian wrote:
> > The HPUX man page for mmap documents its failure return value as "-1",
> > so I hacked around this with
> >
> > #ifndef MAP_FAILED
> > #define MAP_FAILED ((void *) (-1))
> > #endif
> >
> > whereupon it built and passed the simple self-test you suggested.
> > However, I think it's pretty foolish to depend on mmap for such
> > little reason as this code does. I suggest ripping out the mmap
> > usage and just reading the file with good old read(2).
>
> Agreed. Let read() use mmap() internally if it wants to.

The reason mmap() is faster than read() is that it can avoid copying
data to the place you specify. read() can "use mmap() internally" only
in cases rare enough to hardly be worth checking for.

Stdio is often able to use mmap() internally for parsing, and in
glibc-2.x (and, I think, on recent Solaris and BSDs) it does. Usually,
therefore, it would be better to use stdio functions (except fread()!)
in place of read(), where possible, to allow this optimization.

Using mmap() in place of disk read() almost always results in enough
performance improvement to make doing so worth a lot of disruption.
Today mmap() is used heavily enough, in important programs, that
worries about unreliability are no better founded than worries about
read().

Nathan Myers
[EMAIL PROTECTED]
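For readers who have not used it, a minimal sketch of reading a file
through mmap() follows. It is not the gettext code under discussion;
count_lines_mmap() is a made-up function, and counting newlines merely
stands in for whatever parsing the real code does. The MAP_FAILED
fallback is the one quoted in the thread.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MAP_FAILED
#define MAP_FAILED ((void *) (-1))   /* fallback quoted in the thread */
#endif

/*
 * Count newline characters in a file by mapping it read-only, so the data
 * is scanned in place instead of being copied into a buffer by read().
 * Returns the count, or -1 on any failure.
 */
static long count_lines_mmap(const char *path)
{
    struct stat st;
    const char *p;
    long lines = 0;
    off_t i;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0)
    {
        close(fd);
        return -1;
    }
    if (st.st_size == 0)
    {
        close(fd);              /* mmap() rejects a zero length */
        return 0;
    }

    p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    for (i = 0; i < st.st_size; i++)
        if (p[i] == '\n')
            lines++;

    munmap((void *) p, (size_t) st.st_size);
    return lines;
}
```

The read(2) version needs a buffer and a loop that copies every byte
into it before the scan can start, and that copy is the cost the message
above refers to.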
Re: [HACKERS] C++ Headers
On Wed, May 23, 2001 at 11:35:31AM -0400, Bruce Momjian wrote:
> > > > > We have added more const-ness to libpq++ for 7.2.
> > > >
> > > > Breaking link compatibility without bumping the major version number
> > > > on the library seems to me a serious no-no.
> > > >
> > > > To const-ify member functions without breaking link compatibility,
> > > > you have to add another, overloaded member that is const, and turn
> > > > the non-const function into a wrapper. For example:
> > > >
> > > > void Foo::bar() { ... } // existing interface
> > > >
> > > > becomes
> > > >
> > > > void Foo::bar() { ((const Foo*)this)->bar(); }
> > > > void Foo::bar() const { ... }
> > >
> > > Thanks. That was my problem, not knowing when I break link compatibility
> > > in C++. Major updated.
> >
> > Wouldn't it be better to add the forwarding function and keep
> > the same major number? It's quite disruptive to change the
> > major number for what are really very minor changes. Otherwise
> > you accumulate lots of near-copies of almost-identical libraries
> > to be able to run old binaries.
> >
> > A major-number bump should usually be something planned for
> > and scheduled.
>
> That const was just one of many const's added, and I am sure there will
> be more stuff happening to C++. I changed a function returning short
> for tuple length to int. Not worth mucking it up.
>
> If it was just that one it would be OK.

I'll bet lots of people would like to see more careful planning about
breaking link compatibility. Other changes that break link
compatibility include changing a struct or class referred to from
inline functions, and adding a virtual function in a base class.

It's possible to make a lot of improvements without breaking link
compatibility, but it does take more care than in C. If you wonder
whether a change would break link compatibility, please ask on the
list.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] More pgindent follies
On Wed, May 23, 2001 at 11:58:51AM -0400, Bruce Momjian wrote:
> > > I don't see the problem here. My assumption is that the comment is not
> > > part of the define, right?
> >
> > Well, that's the question. ANSI C requires comments to be replaced by
> > whitespace before preprocessor commands are detected/executed, but there
> > was an awful lot of variation in preprocessor behavior before ANSI.
> > I suspect there are still preprocessors out there that might misbehave
> > on this input --- for example, by leaving the text "* end-of-string */"
> > present in the preprocessor output. Now we still go to considerable
> > lengths to support not-quite-ANSI preprocessors. I don't like the idea
> > that all the work done by configure and c.h in that direction might be
> > wasted because of pgindent carelessness.
>
> I agree, but in a certain sense, we would have found those compilers
> already. This is not new behavior as far as I know, and clearly this
> would throw a compiler error.

This is good news!

Maybe this process can be formalized. That is, each official release
might contain a source file with various "modern" constructs which we
suspect might break old compilers. A comment block at the top requests
that any breakage be reported.

A configure option would allow a user to avoid compiling it, and a
comment in the file would explain how to use the option. After a major
release, any modern construct that caused no trouble in the last
release is considered OK to use.

This process makes it easy to leave behind obsolete language
restrictions: if you wonder if it's OK now to use a feature that once
broke some crufty platform, drop it in modern.c and forget about it.
After the next release, you know the answer.

Nathan Myers
[EMAIL PROTECTED]
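To give a feel for what such a file might contain, here is a tiny
sketch. Everything in it is hypothetical: the macro MODERN_EOS and the
function modern_construct_check() are invented, and the configure option
for skipping the file is not shown. The first construct is the
comment-after-#define case from the quoted discussion.

```c
/*
 * modern.c (hypothetical file, following the proposal above)
 *
 * This file deliberately uses constructs that pre-ANSI compilers or
 * preprocessors may mishandle. If it fails to build on your platform,
 * please report the failure to the developers' list.
 */

/* Construct 1: a comment trailing a #define must be replaced by whitespace. */
#define MODERN_EOS '\0'   /* end-of-string */

/* Construct 2: an ANSI prototype with a const-qualified parameter. */
static int modern_construct_check(const char *s);

/* Returns 1 when s is the empty string, which exercises MODERN_EOS. */
static int modern_construct_check(const char *s)
{
    return s[0] == MODERN_EOS;
}
```

A preprocessor that left the comment text in the macro expansion would
fail to compile the return statement, which is the kind of report the
file is meant to provoke.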
Re: [HACKERS] C++ Headers
On Tue, May 22, 2001 at 05:52:20PM -0400, Bruce Momjian wrote:
> > On Tue, May 22, 2001 at 12:19:41AM -0400, Bruce Momjian wrote:
> > > > This in fact has happened within ECPG. But since sizeof(bool) is
> > > > passed to libecpg it was possible to figure out which 'bool' is
> > > > requested.
> > > >
> > > > Another issue of C++ compatibility would be cleaning up the
> > > > usage of 'const' declarations. C++ is really strict about
> > > > 'const'ness. But I don't know whether postgres' internal headers
> > > > would need such a cleanup. (I suspect that in ecpg there is an
> > > > oddity left with respect to host variable declaration. I'll
> > > > check that later)
> > >
> > > We have added more const-ness to libpq++ for 7.2.
> >
> > Breaking link compatibility without bumping the major version number
> > on the library seems to me a serious no-no.
> >
> > To const-ify member functions without breaking link compatibility,
> > you have to add another, overloaded member that is const, and turn
> > the non-const function into a wrapper. For example:
> >
> > void Foo::bar() { ... } // existing interface
> >
> > becomes
> >
> > void Foo::bar() { ((const Foo*)this)->bar(); }
> > void Foo::bar() const { ... }
>
> Thanks. That was my problem, not knowing when I break link compatibility
> in C++. Major updated.

Wouldn't it be better to add the forwarding function and keep the same
major number? It's quite disruptive to change the major number for
what are really very minor changes. Otherwise you accumulate lots of
near-copies of almost-identical libraries to be able to run old
binaries.

A major-number bump should usually be something planned for and
scheduled.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] C++ Headers
On Tue, May 22, 2001 at 12:19:41AM -0400, Bruce Momjian wrote:
> > This in fact has happened within ECPG. But since sizeof(bool) is passed to
> > libecpg it was possible to figure out which 'bool' is requested.
> >
> > Another issue of C++ compatibility would be cleaning up the usage of
> > 'const' declarations. C++ is really strict about 'const'ness. But I don't
> > know whether postgres' internal headers would need such a cleanup. (I
> > suspect that in ecpg there is an oddity left with respect to host variable
> > declaration. I'll check that later)
>
> We have added more const-ness to libpq++ for 7.2.

Breaking link compatibility without bumping the major version number
on the library seems to me a serious no-no.

To const-ify member functions without breaking link compatibility,
you have to add another, overloaded member that is const, and turn
the non-const function into a wrapper. For example:

  void Foo::bar() { ... } // existing interface

becomes

  void Foo::bar() { ((const Foo*)this)->bar(); }
  void Foo::bar() const { ... }

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Plans for solving the VACUUM problem
On Fri, May 18, 2001 at 06:10:10PM -0700, Mikheev, Vadim wrote:
> > Vadim, can you remind me what UNDO is used for?
>
> Ok, last reminder -:))
>
> On transaction abort, read WAL records and undo (rollback)
> changes made in storage. Would allow:
>
> 1. Reclaim space allocated by aborted transactions.
> 2. Implement SAVEPOINTs.
>    Just to remind -:) - in the event of error discovered by server
>    - duplicate key, deadlock, command mistyping, etc, - transaction
>    will be rolled back to the nearest implicit savepoint set
>    just before query execution; - or transaction can be aborted by
>    ROLLBACK TO command to some explicit savepoint
>    set by user. Transaction rolled back to savepoint may be continued.
> 3. Reuse transaction IDs on postmaster restart.
> 4. Split pg_log into small files with ability to remove old ones (which
>    do not hold statuses for any running transactions).

I missed the original discussions; apologies if this has already been
beaten into the ground. But... mightn't sub-transactions be a
better-structured way to expose this service?

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Upgrade issue (again).
On Thu, May 17, 2001 at 12:43:49PM -0400, Rod Taylor wrote:
> Best way to upgrade might be to do something as simple as get the
> master to master replication working.

Master-to-master replication is not simple, and (fortunately) isn't
strictly necessary. The minimal sequence is,

1. Start a backup and a redo log at the same time.
2. Start the new database and read the backup.
3. Get the new database consuming the redo logs.
4. When the new database catches up, make it a hot failover for the old.
5. Turn off the old database and fail over.

The nice thing about this approach is that all the parts used are
essential parts of an enterprise database anyway, regardless of their
usefulness in upgrading.

Master-to-master replication is nice for load balancing, but not
necessary for failover. Its chief benefit, there, is that you wouldn't
need to abort the uncompleted transactions on the old database when
you make the switch. But master-to-master replication is *hard* to
make work, and intrusive besides.

Nathan Myers
[EMAIL PROTECTED]
[HACKERS] storage density
When organizing available free storage for re-use, we will probably
have a choice whether to favor using space in (mostly-) empty blocks,
or in mostly-full blocks. Empty and mostly-empty blocks are quicker --
you can put lots of rows in them before they fill up and you have to
choose another. Preferring mostly-full blocks improves active-storage
and cache density because a table tends to occupy fewer total blocks.

Does anybody know of papers that analyze the tradeoffs involved?

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: "End-to-end" paper
On Thu, May 17, 2001 at 06:04:54PM +0800, Lincoln Yeoh wrote: > At 12:24 AM 17-05-2001 -0700, Nathan Myers wrote: > > > >For those of you who have missed it, here > > > >>http://www.google.com/search?q=cache:web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf+clark+end+to+end&hl=en > > > >is the paper some of us mention, "END-TO-END ARGUMENTS IN SYSTEM DESIGN" > >by Saltzer, Reed, and Clark. > > > >The abstract is: > > > >This paper presents a design principle that helps guide placement > >of functions among the modules of a distributed computer system. > >The principle, called the end-to-end argument, suggests that > >functions placed at low levels of a system may be redundant or > >of little value when compared with the cost of providing them > >at that low level. Examples discussed in the paper include > >bit error recovery, security using encryption, duplicate > >message suppression, recovery from system crashes, and delivery > >acknowledgement. Low level mechanisms to support these functions > >are justified only as performance enhancements. > > > >It was written in 1981 and is undiminished by the subsequent decades. > > Maybe I don't understand the paper. Yes. It bears re-reading. > The end-to-end argument might be true if taking the monolithic > approach. I find more useful ideas gleaned from the RFCs, TCP/IP and > the OSI 7 layer model: modularity, "useful standard interfaces", "Be > liberal in what you accept, and conservative in what you send" and so > on. The end-to-end principle has had profound effects on the design of Internet protocols, perhaps most importantly in keeping them simpler than OSI's. > Within a module I figure the end to end argument might hold, The end-to-end principle isn't particularly applicable within a module. It's a system-design principle. Its prescription for individual modules is: don't imagine that anybody else gets much value from your complex error recovery shenanigans; they have to do their own error recovery anyway. 
You provide more value by making a good effort. > but the author keeps talking about networks and networking. Of course networking is just an example, but it's a particularly good example. Data storage (e.g. disk) is another good example; in the context of the paper it may be thought of as a mechanism for communicating with other (later) times. The point there is that the CRCs and ECC performed by the disk are not sufficient to ensure reliability for the system (e.g. database service); for that, end-to-end measures such as hot-failover, backups, redo logs, and block- or record-level CRCs are needed. The purpose of the disk CRCs is not reliability, a job they cannot do alone, but performance: they help make the need to use the backups and redo logs infrequent enough to be tolerable. > SSL and TCP are useful. The various CRC checks down the IP stack to > the datalink layer have their uses too. Yes, of course they are useful. The authors say so in the paper, and they say precisely how (and how not). > By splitting stuff up at appropriate points, adding or substituting > objects at various layers becomes so much easier. People can download > Postgresql over token ring, Gigabit ethernet, X.25 and so on. As noted in the paper, the principle is most useful in helping to decide what goes in each layer. > Splitting stuff up does mean that the bits and pieces now do have > a certain responsibility. If those responsibilities involve some > redundancies in error checking or encryption or whatever, so be > it, because if done well people can use those bits and pieces in > interesting ways never dreamed of initially. > > For example SSL over TCP over IPSEC over encrypted WAP works (even > though IPSEC is way too complicated :)). There's so much redundancy > there, but at the same time it's not a far fetched scenario - just > someone ordering online on a notebook pc. The authors quote a similar example in the paper, even though it was written twenty years ago. 
> But if a low level module never bothered with error > correction/detection/handling or whatever and was optimized for > an application specific purpose, it's harder to use it for other > purposes. And if you do, some chap could post an article to Bugtraq on > it, mentioning exploit, DoS or buffer overflow. The point is that leaving that stuff _out_ is how you keep low-level mechanisms useful for a variety of purposes. Putting in complicated error-recovery stuff might suit it better for a particular application, but make it less suitable for others. This is why, at the IP layer, packets get tossed at the first sign of congestion. It's why
[HACKERS] "End-to-end" paper
For those of you who have missed it, here

http://www.google.com/search?q=cache:web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf+clark+end+to+end&hl=en

is the paper some of us mention, "END-TO-END ARGUMENTS IN SYSTEM DESIGN"
by Saltzer, Reed, and Clark.

The abstract is:

  This paper presents a design principle that helps guide placement
  of functions among the modules of a distributed computer system.
  The principle, called the end-to-end argument, suggests that
  functions placed at low levels of a system may be redundant or
  of little value when compared with the cost of providing them
  at that low level. Examples discussed in the paper include
  bit error recovery, security using encryption, duplicate
  message suppression, recovery from system crashes, and delivery
  acknowledgement. Low level mechanisms to support these functions
  are justified only as performance enhancements.

It was written in 1981 and is undiminished by the subsequent decades.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Configurable path to look up dynamic libraries
On Tue, May 15, 2001 at 05:53:36PM -0400, Bruce Momjian wrote:
> > But, if I may editorialize a little myself, this is just indicative of a
> > 'Fortress PostgreSQL' attitude that is easy to get into. 'We've always
>
> I have to admit I like the sound of 'Fortress PostgreSQL'. :-)

Ye Olde PostgreSQL Shoppe
The PostgreSQL of Giza
Our Lady of PostgreSQL, Ascendant
PostgreSQL International Airport
PostgreSQL Galactica
PostgreSQL's Tavern
[HACKERS] tables/indexes/logs on different volumes
On Wed, Apr 25, 2001 at 09:41:57AM -0300, The Hermit Hacker wrote: > On Tue, 24 Apr 2001, Nathan Myers wrote: > > > On Tue, Apr 24, 2001 at 11:28:17PM -0300, The Hermit Hacker wrote: > > > I have a Dual-866, 1gig of RAM and strip'd file systems ... this past > > > week, I've hit many times where CPU usage is 100%, RAM is 500Meg free > > > and disks are pretty much sitting idle ... > > > > Assuming "strip'd" above means "striped", it strikes me that you > > might be much better off operating the drives independently, with > > the various tables, indexes, and logs scattered each entirely on one > > drive. > > have you ever tried to maintain a database doing this? PgSQL is > definitely not designed for this sort of setup, I had symlinks going > everywhere, and with the new numbering schema, this is even more > difficult to try and do :) Clearly you need to build a tool to organize it. It would help a lot if PG itself could provide some basic assistance, such as calling a stored procedure to generate the pathname of the file. Has there been any discussion of anything like that? Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] Cursor support in pl/pg
Now that 7.1 is safely in the can, is it time to consider this patch? It provides cursor support in PL. http://www.airs.com/ian/postgresql-cursor.patch Nathan Myers [EMAIL PROTECTED]
[HACKERS] Re: refusing connections based on load ...
On Tue, Apr 24, 2001 at 11:28:17PM -0300, The Hermit Hacker wrote: > I have a Dual-866, 1gig of RAM and strip'd file systems ... this past > week, I've hit many times where CPU usage is 100%, RAM is 500Meg free and > disks are pretty much sitting idle ... Assuming "strip'd" above means "striped", it strikes me that you might be much better off operating the drives independently, with the various tables, indexes, and logs scattered each entirely on one drive. That way the heads can move around independently reading and writing N blocks, rather than all moving in concert reading or writing only one block at a time. (Striping the WAL file on a couple of raw devices might be a good idea along with the above. Can we do that?) But of course speculation is much less useful than trying it. Some measurements before and after would be really, really interesting to many of us. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Re: refusing connections based on load ...
On Tue, Apr 24, 2001 at 12:39:29PM +0800, Lincoln Yeoh wrote: > At 03:09 PM 23-04-2001 -0300, you wrote: > >Basically, if great to set max clients to 256, but if load hits 50 > >as a result, the database is near to useless ... if you set it to 256, > >and 254 idle connections are going, load won't rise much, so is safe, > >but if half of those processes are active, it hurts ... > > Sorry, but I still don't understand the reasons why one would want to do > this. Could someone explain? > > I'm thinking that if I allow 256 clients, and my hardware/OS bogs down > when 60 users are doing lots of queries, I either accept that, or > figure that my hardware/OS actually can't cope with that many clients > and reduce the max clients or upgrade the hardware (or maybe do a > little tweaking here and there). > > Why not be more deterministic about refusing connections and stick > to reducing max clients? If not it seems like a case where you're > promised something but when you need it, you can't have it. The point is that "number of connections" is a very poor estimate of system load. Sometimes a connection is busy, sometimes it's not. Some connections are busy, some are not. The goal is maximum throughput or some tradeoff of maximum throughput against latency. If system throughput varies nonlinearly with load (as it almost always does) then this happens at some particular load level. Refusing a connection and letting the client try again later can be a way to maximize throughput by keeping the system at the optimum point. (Waiting reduces delay. Yes, this is counterintuitive, but why do we queue up at ticket windows?) Delaying response, when under excessive load, to clients who already have a connection -- even if they just got one -- can have a similar effect, but with finer granularity and with less complexity in the clients. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] refusing connections based on load ...
On Mon, Apr 23, 2001 at 10:50:42PM -0400, Tom Lane wrote: > Basically, if we do this then we are abandoning the notion that Postgres > runs as an unprivileged user. I think that's a BAD idea, especially in > an environment that's open enough that you might feel the need to > load-throttle your users. By definition you do not trust them, eh? No. It's not a case of trust, but of providing an adaptive way to keep performance reasonable. The users may have no independent way to cooperate to limit load, but the DB can provide that. > A less dangerous way of approaching it might be to have an option > whereby the postmaster invokes 'uptime' via system() every so often > (maybe once a minute?) and throttles on the basis of the results. > The reaction time would be poorer, but security would be a whole lot > better. Yes, this alternative looks much better to me. On Linux you have the much more efficient alternative, /proc/loadavg. (I wouldn't use system(), though.) Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
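For reference, a rough Python sketch of reading the load average on Linux without spawning `uptime`. The first three whitespace-separated fields of `/proc/loadavg` are the 1-, 5-, and 15-minute averages. The function names are made up for illustration, and this is not a proposal for how the postmaster would do it (that would be C).

```python
from typing import Tuple


def parse_loadavg(text: str) -> Tuple[float, float, float]:
    """Parse the contents of /proc/loadavg into (1min, 5min, 15min)."""
    fields = text.split()
    if len(fields) < 3:
        raise ValueError("unexpected /proc/loadavg format: %r" % text)
    return float(fields[0]), float(fields[1]), float(fields[2])


def read_loadavg(path: str = "/proc/loadavg") -> Tuple[float, float, float]:
    """Read the load averages straight from procfs (Linux only)."""
    with open(path) as f:
        return parse_loadavg(f.read())
```

On other Unix systems, Python's `os.getloadavg()` wraps the platform's own call and avoids procfs entirely.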
Re: [HACKERS] refusing connections based on load ...
On Mon, Apr 23, 2001 at 03:09:53PM -0300, The Hermit Hacker wrote: > > Anyone thought of implementing this, similar to how sendmail does it? If > load > n, refuse connections? > ... > If nobody is working on something like this, does anyone but me feel that > it has merit to make use of? I'll play with it if so ... I agree that it would be useful. Even more useful would be soft load shedding, where once some load average level is exceeded the postmaster delays a bit (proportionately) before accepting a connection. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
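The soft load shedding suggested above could be sketched like this in Python. Every parameter here (the threshold, the seconds of delay per unit of excess load, the cap) is a hypothetical knob chosen for illustration; sensible values would have to come from measurement.

```python
def accept_delay(load: float,
                 threshold: float = 4.0,
                 seconds_per_unit: float = 0.5,
                 max_delay: float = 5.0) -> float:
    """Seconds to wait before accepting a new connection.

    Zero at or below the threshold; above it, the delay grows in
    proportion to the excess load, up to max_delay.
    """
    if load <= threshold:
        return 0.0
    return min((load - threshold) * seconds_per_unit, max_delay)
```

Unlike a hard cutoff, this never refuses anybody outright; it only slows the rate at which new work arrives while the system is overloaded.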
Re: [HACKERS] Re: Is it possible to mirror the db in Postgres?
On Fri, Apr 20, 2001 at 04:53:43PM -0700, G. Anthony Reina wrote: > Nathan Myers wrote: > > > Does the replication have to be reliable? Are you equipped to > > reconcile databases that have got out of sync, when it's not? > > Will the different labs ever try to update the same existing > > record, or insert conflicting (unique-key) records? > > (1) Yes, of course. (2) Willing--yes; equipped--dunno. (3) Yes, > probably. Hmm, good luck. Replication, by itself, is not hard, but it's only a tiny part of the job. Most of the job is in handling failures and conflicts correctly, for some (usually enormous) definition of "correctly". > > Reliable WAN replication is harder. Most of the proprietary database > > companies will tell you they can do it, but their customers will tell > > you they can't. > > Joel Burton suggested the rserv utility. I don't know how well it would > work over a wide network. The point about WANs is that things which work nicely in the lab, on a LAN, behave very differently when the communication medium is, like the Internet, only fitfully reliable. You will tend to have events occurring in unexpected order, and communications lost, and queues topping over, and conflicting entries in different instances which you must somehow reconcile after the fact. Reconciliation by shipping the whole database across the WAN is often impractical, particularly when you're trying to use it at the same time. WAN replication is an important part of Zembu's business, and it's hard. I would expect the rserv utility (about which I admit I know little) not to have been designed for the job. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Is it possible to mirror the db in Postgres?
On Fri, Apr 20, 2001 at 03:33:38PM -0700, G. Anthony Reina wrote: > We use Postgres 7.0.3 to store data for our scientific research. We have > two other labs in St. Louis, MO and Tempe, AZ. I'd like to see if > there's a way for them to mirror our database. They would be able to > update our database when they received new results and we would be able > to update theirs. So, in effect, we'd have 3 copies of the same db. Each > copy would be able to update the other. > > Any thoughts on if this is possible? Does the replication have to be reliable? Are you equipped to reconcile databases that have got out of sync, if not? Will the different labs ever try to update the same existing record, or insert conflicting (unique-key) records? Symmetric replication is easy or impossible, but usually somewhere in between, depending on many details. Usually when it's made to work, it runs on a LAN. Reliable WAN replication is harder. Most of the proprietary database companies will tell you they can do it, but their customers will tell you they can't. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] timeout on lock feature
On Wed, Apr 18, 2001 at 09:39:39PM -0400, Bruce Momjian wrote: > > On Wed, Apr 18, 2001 at 07:33:24PM -0400, Bruce Momjian wrote: > > > > What might be a reasonable alternative would be a BEGIN timeout: > > > > report failure as soon as possible after N seconds unless the > > > > timer is reset, such as by a commit. Such a timeout would be > > > > meaningful at the database-interface level. It could serve as a > > > > useful building block for application-level timeouts when the > > > > client environment has trouble applying timeouts on its own. > > > > > > Now that is a nifty idea. Just put it on one command, BEGIN, and > > > have it apply for the whole transaction. We could just set an > > > alarm and do a longjump out on timeout. > > > > Of course, it begs the question why the client couldn't do that > > itself, and leave PG out of the picture. But that's what we've > > been talking about all along. > > Yes, they can, but of course, they could code the database in the > application too. It is much easier to put the timeout in a psql script > than to try and code it. Good: add a timeout feature to psql. There's no limit to what features you might add to the database core once you decide that new features need have nothing to do with databases. Why not (drum roll...) deliver e-mail? Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] timeout on lock feature
On Wed, Apr 18, 2001 at 07:33:24PM -0400, Bruce Momjian wrote: > > What might be a reasonable alternative would be a BEGIN timeout: report > > failure as soon as possible after N seconds unless the timer is reset, > > such as by a commit. Such a timeout would be meaningful at the > > database-interface level. It could serve as a useful building block > > for application-level timeouts when the client environment has trouble > > applying timeouts on its own. > > Now that is a nifty idea. Just put it on one command, BEGIN, and have > it apply for the whole transaction. We could just set an alarm and do a > longjump out on timeout. Of course, it begs the question why the client couldn't do that itself, and leave PG out of the picture. But that's what we've been talking about all along. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] timeout on lock feature
On Wed, Apr 18, 2001 at 09:54:11AM +0200, Zeugswetter Andreas SB wrote: > > > In short, I think lock timeout is a solution searching in vain for a > > > problem. If we implement it, we are just encouraging bad application > > > design. > > > > I agree with Tom completely here. > > > > In any real-world application the database is the key component of a > > larger system: the work it does is the most finicky, and any mistakes > > (either internally or, more commonly, from misuse) have the most > > far-reaching consequences. The responsibility of the database is to > > provide a reliable and easily described and understood mechanism to > > build on. > > It is not something that makes anything unrelyable or less robust. > It is also simple: "I (the client) request that you (the backend) > dont wait for any lock longer than x seconds" Many things that are easy to say have complicated consequences. > > Timeouts are a system-level mechanism that to be useful must refer to > > system-level events that are far above anything that PG knows about. > > I think you are talking about different kinds of timeouts here. Exactly. I'm talking about useful, meaningful timeouts, not random timeouts attached to invisible events within the database. > > The only way PG could apply reasonable timeouts would be for the > > application to dictate them, > > That is exactly what we are talking about here. No. You wrote elsewhere that the application sets "30 seconds" and leaves it. But that 30 seconds doesn't have any application-level meaning -- an operation could take twelve hours without tripping your 30-second timeout. For the application to dictate the timeouts reasonably, PG would have to expose all its lock events to the client and expect it to deduce how they affect overall behavior. > > but the application can better implement them itself. > > It can, but it makes the program more complicated (needs timers > or threads, which violates your last statement "simplest interface". 
It is good for the program to be more complicated if it is doing a more complicated thing -- if it means the database may remain simple. People building complex systems have an even greater need for simple components than people building little ones. What might be a reasonable alternative would be a BEGIN timeout: report failure as soon as possible after N seconds unless the timer is reset, such as by a commit. Such a timeout would be meaningful at the database-interface level. It could serve as a useful building block for application-level timeouts when the client environment has trouble applying timeouts on its own. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
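A minimal Python sketch of the BEGIN-timeout semantics described above, seen from the side that enforces it: the timer starts at BEGIN, is cleared by a commit, and failure is reported at the first check after it expires. The class and method names are invented for illustration, and the clock is injectable only so the behavior can be tested without sleeping.

```python
import time


class TransactionTimeout(Exception):
    pass


class TransactionDeadline:
    """Tracks one per-transaction deadline (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._deadline = None

    def begin(self, timeout_seconds: float) -> None:
        # BEGIN starts the timer for the whole transaction.
        self._deadline = self._clock() + timeout_seconds

    def commit(self) -> None:
        # A commit resets the timer.
        self._deadline = None

    def check(self) -> None:
        # Call at any convenient point; fails as soon as possible
        # after the deadline has passed.
        if self._deadline is not None and self._clock() >= self._deadline:
            self._deadline = None
            raise TransactionTimeout("transaction exceeded its time limit")
```

Note that the limit applies to the transaction as a whole, which is an event the application can reason about, rather than to any individual lock wait.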
Re: [HACKERS] CRN article not updated
On Wed, Apr 18, 2001 at 02:22:48PM -0400, Bruce Momjian wrote: > I just checked the CRN PostgreSQL article at: > >http://www.crn.com/Sections/Fast_Forward/fast_forward.asp?ArticleID=25670 > > I see no changes to the article, even though Vince our webmaster, Geoff > Davidson of PostgreSQL, Inc, and Dave Mele of Great Bridge have > requested it be fixed. If _you_ had been deluged with that kind of vitriol, what kind of favors would you feel like doing? > Not sure what we can do now. It's too late. "We" screwed it up. (Thanks again, guys.) The responses have done far more lasting damage than any article could ever have done. The horse is dead. The best we can do is to plan for the future. 1. What happens the next time a slightly inaccurate article is published? 2. What happens when an openly hostile article is published? Will our posse ride off again with guns blazing, making more enemies? Will they make us all look to potential users like a bunch of hotheaded, childish nobodies? Or will we have somebody appointed, already, to write a measured, rational, mature clarification? Will we have articles already written, and handed to more responsible reporters, so that an isolated badly-done article can do little damage? We're not even on Oracle's radar yet. When PG begins to threaten their income, their marketing department will go on the offensive. Oracle marketing is very, very skillful, and very, very nasty. If they find that by seeding the press with reasonable-sounding criticisms of PG, they can prod the PG community into making itself look like idiots, they will go to town on it. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] timeout on lock feature
On Tue, Apr 17, 2001 at 12:56:11PM -0400, Tom Lane wrote: > In short, I think lock timeout is a solution searching in vain for a > problem. If we implement it, we are just encouraging bad application > design. I agree with Tom completely here. In any real-world application the database is the key component of a larger system: the work it does is the most finicky, and any mistakes (either internally or, more commonly, from misuse) have the most far-reaching consequences. The responsibility of the database is to provide a reliable and easily described and understood mechanism to build on. Timeouts are a system-level mechanism that to be useful must refer to system-level events that are far above anything that PG knows about. The only way PG could apply reasonable timeouts would be for the application to dictate them, but the application can better implement them itself. You can think of this as another aspect of the "end-to-end" principle: any system-level construct duplicated in a lower-level system component can only improve efficiency, not provide the corresponding high-level service. If we have timeouts in the database, they should be there to enable the database to better implement its abstraction, and not pretend to be a substitute for system-level timeouts. There's no upper limit on how complicated a database interface can become (cf. Oracle). The database serves its users best by having the simplest interface that can possibly provide the needed service. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Another news story in need of 'enlightenment'
On Tue, Apr 17, 2001 at 01:31:43PM -0400, Lamar Owen wrote: > This one probably needs the 'iron hand and the velvet paw' touch. The > iron hand to pound some sense into the author, and the velvet paw to > make him like having sense pounded into him. Title of article is 'Open > Source Databases Won't Fly' -- > http://www.dqindia.com/content/enterprise/datawatch/101041201.asp This one is best just ignored. It's content-free, just his frightened opinions. The only thing that will change his mind is the improvements planned for releases 7.2 and 7.3, and lots of deployments. Few will read his rambling. Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] Re: Hey guys, check this out.
On Sun, Apr 15, 2001 at 10:05:46PM -0400, Vince Vielhaber wrote: > On Mon, 16 Apr 2001, Lincoln Yeoh wrote: > > > Maybe you guys should get some Great Bridge marketing/PR person to handle > > stuff like this. > > After reading Ned's comments I figured that's how it got that way in > the first place. But that's just speculation. You probably figured wrong. All those publications have editors who generally feel they're not doing their job if they don't introduce errors, usually without even talking to the reporter. That's probably how the "FreeBSD" reference got in there: somebody saw "Berkeley" and decided "FreeBSD" would look more "techie". It's stupid, but nothing to excoriate the reporter about. Sam Williams's articles read completely differently according to who publishes them. Typically the Linux magazines print what he writes, and thereby get it mostly right, but the finance magazines mangle them to total nonsense. Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] Fast Forward (fwd)
On Sun, Apr 15, 2001 at 11:44:48AM -0300, The Hermit Hacker wrote: > On Sat, 14 Apr 2001, Nathan Myers wrote: > > > This is probably a good time to point out that this is the _worst_ > > _possible_ response to erroneous reportage. The perception by readers > > will not be that the reporter failed, but that PostgreSQL advocates > > are rabid weasels who don't appreciate favorable attention, and are > > favorable attention?? Yes, totally favorable. There wasn't a hint of the condescension typically accorded free software. All of the details you find so objectionable (April vs. June? "The" marketing arm vs. "a" marketing arm?) would not even be noticed by a non-cultist. > > dangerous to write anything about. You can bet this reporter and her > > editor will treat the topic very circumspectly (i.e. avoid it) in the > > future. > > woo hoo, if that is the result, then I think Vince did us a great service, > not dis-service ... False. This may have been the reporter's and the editor's first direct exposure to free software advocates. You guys came across as hate-filled religious whackos, and that reflects on all of us. > > Most reporters are ignorant, most reporters are lazy, and many are > > both. It's part of the job description. Getting angry about it is > > like getting angry at birds for fouling their cage. Their job is to > > regurgitate what they're given, and quickly. They have no time to > > learn the depths, or to write coherently about it, or even to check > > facts. > > Out of all the articles on PgSQL that I've read over the years, this one > should have been shot before it hit the paper (so to say) ... it was the > most blatantly inaccurate article I've ever read ... It had a number of minor errors, easily corrected. The next will probably talk about what a bunch of nasty cranks and lunatics PostgreSQL fans are, unless you who wrote can display a lot more finesse in your apologies. Thanks a lot, guys. 
> > It will be harder than the original mailings, but I urge each who > > wrote to write again and apologize for attacking her. > > In a way, I think you are right .. I think the attack was aimed at the > wrong ppl :( She obviously didn't get *any* of her information from ppl > that belong *in* the Pg community, or that have any knowledge of how it > works, or of its history :( How is this reporter going to have developed contacts within the community? She has just started. Now you've burnt her to a crisp, and she will figure the less contact with that "community" she has, the happier she'll be. Her editor will know that mentioning PG in any context will result in a raft of hate mail from cranks, and will treat press releases from our community with the scorn they have earned. Reporters are fragile creatures, and must be gently guided toward the light. They will always get facts wrong, but that matters not at all. The overall tone of the writing is the only thing that stays with their equally dim audience. That dim audience controls the budgets for technology deployment, including databases. Next time you propose a deployment on PG instead of Oracle, thank Vince et al. when it's dismissed as a crank toy. Finally, their talkback page was most probably implemented _not_ with MySQL, but with MS SQL Server. These intramural squabbles (between MySQL and PG, between Linux and BSD, between NetBSD and OpenBSD) are justifiably seen as pathetic in the outside world. Respectful attention among projects doesn't just create a better impression, it also allows you, maybe, to learn something. (MySQL is not objectively as good as PG, but those guys are doing something right, in their presentation, that some of us could learn from.) Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] Fast Forward (fwd)
On Sun, Apr 15, 2001 at 01:17:15AM -0400, Vince Vielhaber wrote: > > Here's my response to the inaccurate article cmp produced. After > chatting with Marc I decided to post it myself. > ... > Where do you get your info? Do you just make it up? PostgreSQL is > not a product of Great Bridge and never has been. It's 100% independant. > Is Linux a keyword you figure you can use to draw readers? Won't take > long before folks determine you're full of it. The PostgreSQL team takes > great pride (not to be confused with great bridge) in ensuring that the > work we do runs on ALL platforms; be it Mac's OSX, FreeBSD 4.3, or even > Windows 2000. So why do you figure this is a Great Bridge product? Why > do you figure it's Linux only? What is it with you writers lately? Are > you getting lazy and simply using Linux as a quick out for a paycheck? This is probably a good time to point out that this is the _worst_ _possible_ response to erroneous reportage. The perception by readers will not be that the reporter failed, but that PostgreSQL advocates are rabid weasels who don't appreciate favorable attention, and are dangerous to write anything about. You can bet this reporter and her editor will treat the topic very circumspectly (i.e. avoid it) in the future. When they have to mention it, their reporting will be colored by their personal experience. They (and their readers) don't run the code, so they must get their impressions from those who do. Most reporters are ignorant, most reporters are lazy, and many are both. It's part of the job description. Getting angry about it is like getting angry at birds for fouling their cage. Their job is to regurgitate what they're given, and quickly. They have no time to learn the depths, or to write coherently about it, or even to check facts. None of the errors in the article matter. Nobody will develop an enduring impression of PG from them. What matters is that PG is being mentioned in the same article with Oracle. 
In her limited way, she did the PG community the biggest favor in her limited power, and all we can do is attack? It will be harder than the original mailings, but I urge each who wrote to write again and apologize for attacking her. Thank her graciously for making an effort, and offer to help her check her facts next time. PostgreSQL needs friends in the press, even if they are ignorant or lazy. It doesn't need any enemies in the press. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Anyone have any good addresses ... ?
On Fri, Apr 13, 2001 at 06:32:26PM -0400, Trond Eivind Glomsrød wrote: > The Hermit Hacker <[EMAIL PROTECTED]> writes: > > > Here is what we've always sent to to date ... anyone have any good ones > > to add? > > > > > > Addresses : [EMAIL PROTECTED], > > [EMAIL PROTECTED], > > [EMAIL PROTECTED], > > [EMAIL PROTECTED], > > [EMAIL PROTECTED], > > [EMAIL PROTECTED], > > [EMAIL PROTECTED], > > [EMAIL PROTECTED], > > [EMAIL PROTECTED] > > Freshmeat, linuxtoday. If the release includes RPMs for Red Hat Linux, > redhat-announce is also a suitable location. Linux Journal: [EMAIL PROTECTED] Freshmeat: [EMAIL PROTECTED] LinuxToday: http://linuxtoday.com/contribute.php3 -- Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] Truncation of object names
On Fri, Apr 13, 2001 at 04:27:15PM -0400, Tom Lane wrote: > [EMAIL PROTECTED] (Nathan Myers) writes: > > We are thinking about working around the name length limitation > > (encountered in migrating from other dbs) by allowing "foo.bar.baz" > > name syntax, as a sort of rudimentary namespace mechanism. > > Have you thought about simply increasing NAMEDATALEN in your > installation? If you really are generating names that aren't unique > in 31 characters, that seems like the way to go ... We discussed that, and will probably do it (too). One problem is that, having translated "foo.bar.baz" to "foo_bar_baz", you have a problem when you encounter "foo.bar_baz" in subsequent code. I.e., a separate delimiter character helps, even when name length isn't an issue. Also, accepting the names as they appear in the source code already means the number of changes needed is much smaller, even when you don't have true schema support. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
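The delimiter problem mentioned above is easy to show with a few lines of Python. The helper names are hypothetical; the sketch only demonstrates that rewriting "." as "_" is not a one-to-one mapping, so two distinct source names can land on the same object name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def flatten(name: str) -> str:
    """Naive translation of a dotted name into a flat identifier."""
    return name.replace(".", "_")


def find_collisions(names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each flattened name to the distinct originals that produce it,
    keeping only those with more than one original."""
    groups = defaultdict(list)
    for name in names:
        if name not in groups[flatten(name)]:
            groups[flatten(name)].append(name)
    return {flat: originals
            for flat, originals in groups.items() if len(originals) > 1}
```

With a delimiter that cannot appear inside a name component, the two names stay distinct, which is the argument for accepting "foo.bar.baz" as written.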
Re: [HACKERS] Truncation of object names
On Fri, Apr 13, 2001 at 02:54:47PM -0400, Tom Lane wrote: > [EMAIL PROTECTED] (Nathan Myers) writes: > > Sorry, false alarm. When I got the test case, it turned out to > > be the more familiar problem: > > > create table foo_..._bar1 (id1 ...); > > [notice, "foo_..._bar1" truncated to "foo_..._bar"] > > create table foo_..._bar (id2 ...); > > [error, foo_..._bar already exists] > > create index foo_..._bar_ix on foo_..._bar(id2); > > [notice, "foo_..._bar_ix" truncated to "foo_..._bar"] > > [error, foo_..._bar already exists] > > [error, attribute "id2" not found] > > > It would be more helpful for the first "create" to fail so we don't > > end up cluttered with objects that shouldn't exist, and which interfere > > with operations on objects which should. > > Seems to me that if you want a bunch of CREATEs to be mutually > dependent, then you wrap them all in a BEGIN/END block. Yes, but... The second and third commands weren't supposed to be related to the first at all, never mind dependent on it. They were made dependent by PG crushing the names together. We are thinking about working around the name length limitation (encountered in migrating from other dbs) by allowing "foo.bar.baz" name syntax, as a sort of rudimentary namespace mechanism. It ain't schemas, but it's better than "foo__bar__baz". Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Truncation of object names
On Fri, Apr 13, 2001 at 01:16:43AM -0400, Tom Lane wrote:
> [EMAIL PROTECTED] (Nathan Myers) writes:
> > We have noticed here also that object (e.g. table) names get truncated
> > in some places and not others. If you create a table with a long name,
> > PG truncates the name and creates a table with the shorter name; but
> > if you refer to the table by the same long name, PG reports an error.
>
> Example please? This is clearly a bug.

Sorry, false alarm. When I got the test case, it turned out to be the more familiar problem:

  create table foo_..._bar1 (id1 ...);
    [notice, "foo_..._bar1" truncated to "foo_..._bar"]
  create table foo_..._bar (id2 ...);
    [error, foo_..._bar already exists]
  create index foo_..._bar_ix on foo_..._bar(id2);
    [notice, "foo_..._bar_ix" truncated to "foo_..._bar"]
    [error, foo_..._bar already exists]
    [error, attribute "id2" not found]

It would be more helpful for the first "create" to fail so we don't end up cluttered with objects that shouldn't exist, and which interfere with operations on objects which should. But I'm not proposing that for 7.1.

Nathan Myers [EMAIL PROTECTED]
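The failure mode in this message can be imitated with a toy catalog in Python. This is only a model of the behavior described in the thread, not PostgreSQL code: the 31-character limit is the figure quoted in the follow-up discussion, and the `strict` flag is a hypothetical switch standing in for "make the first create fail".

```python
NAME_LIMIT = 31  # limit quoted in the thread; an assumption elsewhere


def create(catalog: set, name: str, strict: bool = False) -> str:
    """Register an object name in a toy catalog and return the stored name.

    Non-strict mode silently truncates, as described in the message;
    strict mode refuses over-long names before touching the catalog.
    """
    stored = name[:NAME_LIMIT]
    if strict and stored != name:
        raise ValueError("name too long: %r" % name)
    if stored in catalog:
        raise KeyError("%r already exists" % stored)
    catalog.add(stored)
    return stored
```

In non-strict mode the over-long name quietly takes the slot that a later, legitimate name needs; in strict mode the error appears where the mistake was made.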
Re: [HACKERS] Re: Hand written parsers
On Wed, Apr 11, 2001 at 10:44:59PM -0700, Ian Lance Taylor wrote:
> Mark Butler <[EMAIL PROTECTED]> writes:
> > ...
> > The advantages of using a hand written recursive descent parser lie in
> > 1) ease of implementing grammar changes
> > 2) ease of debugging
> > 3) ability to handle unusual cases
> > 4) ability to support context sensitive grammars
> > ...
> > Another nice capability is the ability to enable and disable grammar
> > rules at run time ...
>
> On the other hand, recursive descent parsers tend to be more ad hoc,
> they tend to be harder to maintain, and they tend to be less
> efficient. ... And I note that despite the
> difficulties, the g++ parser is yacc based.

Yacc and yacc-like programs are most useful when the target grammar (or
your understanding of it) is not very stable. With Yacc you can make
sweeping changes much more easily; big changes can be a lot of work in
a hand-coded parser. Once your grammar stabilizes, though, hand coding
can provide flexibility that is inconceivable in a parser generator,
albeit at some cost in speed and compact description. (I doubt parser
speed is an issue for PG.)

G++ has flirted seriously with switching to a recursive-descent parser,
largely to be able to offer meaningful error messages and to recover
better from errors, as well as to be able to parse some problematic
but conformant (if unlikely) programs.

Note that the choice is not just between Yacc and a hand-coded parser.
Since Yacc, many more powerful parser generators have been released,
one of which might be just right for PG.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Truncation of char, varchar types
On Mon, Apr 09, 2001 at 09:20:42PM +0200, Peter Eisentraut wrote:
> Excessively long values are currently silently truncated when they are
> inserted into char or varchar fields. This makes the entire notion of
> specifying a length limit for these types kind of useless, IMO. Needless
> to say, it's also not in compliance with SQL.
>
> How do people feel about changing this to raise an error in this
> situation? Does anybody rely on silent truncation? Should this be
> user-settable, or can those people resort to using triggers?

Yes, detecting and reporting errors early is a Good Thing. You don't
do anybody any favors by pretending to save data, but really throwing
it away.

We have noticed here also that object (e.g. table) names get truncated
in some places and not others. If you create a table with a long name,
PG truncates the name and creates a table with the shorter name; but
if you refer to the table by the same long name, PG reports an error.
(Very long names may show up in machine-generated schemas.) Would
patches for this, e.g. to refuse to create a table with an impossible
name, be welcome?

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: TODO list
On Thu, Apr 05, 2001 at 06:25:17PM -0400, Tom Lane wrote:
> "Mikheev, Vadim" <[EMAIL PROTECTED]> writes:
> >> If the reason that a block CRC isn't on the TODO list is that Vadim
> >> objects, maybe we should hear some reasons why he objects? Maybe
> >> the objections could be dealt with, and everyone satisfied.
>
> > Unordered disk writes are covered by backing up modified blocks
> > in log. It allows not only catch such writes, as would CRC do,
> > but *avoid* them.
>
> > So, for what CRC could be used? To catch disk damages?
> > Disk has its own CRC for this.
>
> Blocks that have recently been written, but failed to make it down to
> the disk platter intact, should be restorable from the WAL log. So we
> do not need a block-level CRC to guard against partial writes.

If a block is missing some sectors in the middle, how would you know
to reconstruct it from the WAL, without a block CRC telling you that
the block is corrupt?

> A block-level CRC might be useful to guard against long-term data
> lossage, but Vadim thinks that the disk's own CRCs ought to be
> sufficient for that (and I can't say I disagree).

The people who make the disks don't agree.

They publish the error rate they guarantee, and they meet it, more
or less. They publish a rate that is _just_ low enough to satisfy
noncritical requirements (on the correct assumption that they can't
satisfy critical requirements in any case) and high enough not to
interfere with benchmarks. They assume that if you need better
reliability you can and will provide it yourself, and rely on their
CRC only as a performance optimization.

At the raw sector level, they get (and correct) errors very frequently;
when they are not getting "enough" errors, they pack the bits more
densely until they do, and sell a higher-density drive.

> So the only real benefit of a block-level CRC would be to guard against
> bits dropped in transit from the disk surface to someplace else, ie,
> during read or during a "cp -r" type copy of the database to another
> location. That's not a totally negligible risk, but is it worth the
> overhead of updating and checking block CRCs? Seems dubious at best.

Vadim didn't want to re-open this discussion until after 7.1 is out
the door, but that "dubious at best" demands an answer. See the archive
posting:

  http://www.postgresql.org/mhonarc/pgsql-hackers/2001-01/msg00473.html

...

Incidentally, is the page at

  http://www.postgresql.org/mhonarc/pgsql-hackers/2001-01/

the best place to find old messages? It's never worked right for me.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: TODO list
On Thu, Apr 05, 2001 at 02:47:41PM -0700, Mikheev, Vadim wrote:
> > > So, for what CRC could be used? To catch disk damages?
> > > Disk has its own CRC for this.
> >
> > OK, this was already discussed, maybe while Vadim was absent.
> > Should I re-post the previous text?
>
> Let's return to this discussion *after* 7.1 release.
> My main objection was (and is) - no time to deal with
> this issue for 7.1.

OK, everybody agreed on that before.

This doesn't read like an objection to having it on the TODO
list for some future release.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: TODO list
On Thu, Apr 05, 2001 at 02:27:48PM -0700, Mikheev, Vadim wrote:
> > If the reason that a block CRC isn't on the TODO list is that Vadim
> > objects, maybe we should hear some reasons why he objects? Maybe
> > the objections could be dealt with, and everyone satisfied.
>
> Unordered disk writes are covered by backing up modified blocks
> in log. It allows not only catch such writes, as would CRC do,
> but *avoid* them.
>
> So, for what CRC could be used? To catch disk damages?
> Disk has its own CRC for this.

OK, this was already discussed, maybe while Vadim was absent.
Should I re-post the previous text?

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: TODO list
On Thu, Apr 05, 2001 at 04:25:42PM -0400, Ken Hirsch wrote:
> > > > TODO updated. I know we did number 2, but did we agree on #1 and is it
> > > > done?
> > >
> > > #2 is indeed done. #1 is not done, and possibly not agreed to ---
> > > I think Vadim had doubts about its usefulness, though personally I'd
> > > like to see it.
> >
> > That was my recollection too. This was the discussion about testing the
> > disk hardware. #1 removed.
>
> What is recommended in the bible (Gray and Reuter), especially for larger
> disk block sizes that may not be written atomically, is to have a word at
> the end of the block that must match a word at the beginning of the block.
> It gets changed each time you write the block.

That only works if your blocks are atomic. Even SCSI disks reorder
sector writes, and they are free to write the first and last sectors
of an 8k-32k block, and not have written the intermediate sectors
before the power goes out. On IDE disks it is of course far worse.

(On many (most?) IDE drives, even when they have been told to report
write completion only after data is physically on the platter, they will
"forget" if they see activity that looks like benchmarking. Others just
ignore the command, and in any case they all default to unsafe mode.)

If the reason that a block CRC isn't on the TODO list is that Vadim
objects, maybe we should hear some reasons why he objects? Maybe
the objections could be dealt with, and everyone satisfied.

Nathan Myers
[EMAIL PROTECTED]
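The failure mode described above can be sketched in C: a block whose first and last sectors were rewritten, but whose middle sectors were not, still has matching words at both ends, while a checksum over the contents does not match. This is a minimal illustration under stated assumptions, not PostgreSQL's page layout: the 8k block, the 512-byte sector, the field offsets and every function name are hypothetical, and a plain bitwise CRC-32 stands in for whatever checksum a real design would choose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define BLOCK_SIZE  8192                 /* 16 sectors */
#define STAMP_HEAD  0                    /* generation stamp, first 4 bytes */
#define CRC_OFFSET  4                    /* CRC of bytes 8..BLOCK_SIZE-1 */
#define DATA_OFFSET 8
#define STAMP_TAIL  (BLOCK_SIZE - 4)     /* copy of the stamp, last 4 bytes */

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t crc32_bytes(const unsigned char *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    int k;

    assert(p != NULL);
    while (n--) {
        crc ^= *p++;
        for (k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

static void put32(unsigned char *block, size_t off, uint32_t v)
{
    memcpy(block + off, &v, sizeof v);
}

static uint32_t get32(const unsigned char *block, size_t off)
{
    uint32_t v;

    memcpy(&v, block + off, sizeof v);
    return v;
}

/* Build a block image: payload filled with one byte value, the same
 * stamp at both ends, and a CRC over everything after the CRC field. */
void block_build(unsigned char *block, uint32_t stamp, unsigned char fill)
{
    memset(block, fill, BLOCK_SIZE);
    put32(block, STAMP_HEAD, stamp);
    put32(block, STAMP_TAIL, stamp);
    put32(block, CRC_OFFSET,
          crc32_bytes(block + DATA_OFFSET, BLOCK_SIZE - DATA_OFFSET));
}

/* Simulate power loss after only the first and last sectors of the
 * newer image reached the platter. */
void torn_write(unsigned char *disk, const unsigned char *newer)
{
    memcpy(disk, newer, SECTOR_SIZE);
    memcpy(disk + BLOCK_SIZE - SECTOR_SIZE,
           newer + BLOCK_SIZE - SECTOR_SIZE, SECTOR_SIZE);
}

/* The head/tail word check from the quoted recommendation. */
int stamps_match(const unsigned char *block)
{
    return get32(block, STAMP_HEAD) == get32(block, STAMP_TAIL);
}

/* The block-level CRC check argued for in the reply. */
int crc_matches(const unsigned char *block)
{
    return get32(block, CRC_OFFSET) ==
           crc32_bytes(block + DATA_OFFSET, BLOCK_SIZE - DATA_OFFSET);
}
```

After torn_write(), stamps_match() still reports success because both ends come from the newer image, while crc_matches() fails because the middle sectors still hold the older payload.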
Re: [HACKERS] Re: Final call for platform testing
On Tue, Apr 03, 2001 at 11:19:04PM +, Thomas Lockhart wrote:
> > I saw three separate reports of successful builds on Linux 2.4.2 on x86
> > (including mine), but it isn't listed here.
>
> It is listed in the comments in the real docs. At least one report was
> for an extensively patched 2.4.2, and I'm not sure of the true lineage
> of the others.

You could ask. Just to ignore reports that you have asked for is not
polite. My report was based on a virgin, unpatched 2.4.2 kernel, and
(as noted) the Debian-packaged glibc-2.2.2.

If you are trying to trim your list, it would be reasonable to drop
Linux-2.0.x, because that version is not being maintained any more.

> I *could* remove the version info from the x86 listing, and mention both
> 2.2.x and 2.4.x in the comments.

Linux-2.2 and Linux-2.4 are different codebases. It is worth noting,
besides, the glibc version tested along with each Linux kernel version.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Final call for platform testing
On Tue, Apr 03, 2001 at 03:31:25PM +, Thomas Lockhart wrote:
> > OK. So we are close to a final tally of supported machines.
> ...
> Here are the up-to-date platforms:
>
> AIX 4.3.3           RS6000   7.1  2001-03-21, Gilles Darold
> BeOS 5.0.4          x86      7.1  2000-12-18, Cyril Velter
> BSDI 4.01           x86      7.1  2001-03-19, Bruce Momjian
> Compaq Tru64 4.0g   Alpha    7.1  2001-03-19, Brent Verner
> FreeBSD 4.3         x86      7.1  2001-03-19, Vince Vielhaber
> HPUX                PA-RISC  7.1  2001-03-19, 10.20 Tom Lane, 11.00 Giles Lean
> IRIX 6.5.11         MIPS     7.1  2001-03-22, Robert Bruccoleri
> Linux 2.2.x         Alpha    7.1  2001-01-23, Ryan Kirkpatrick
> Linux 2.2.x         armv4l   7.1  2001-03-22, Mark Knox
> Linux 2.0.x         MIPS     7.1  2001-03-30, Dominic Eidson
> Linux 2.2.18        PPC74xx  7.1  2001-03-19, Tom Lane
> Linux 2.2.x         S/390    7.1  2000-11-17, Neale Ferguson
> Linux 2.2.15        Sparc    7.1  2001-01-30, Ryan Kirkpatrick
> Linux 2.2.16        x86      7.1  2001-03-19, Thomas Lockhart
> MacOS X Darwin      PPC      7.1  2000-12-11, Peter Bierman
> NetBSD 1.5          Alpha    7.1  2001-03-22, Giles Lean
> NetBSD 1.5E         arm32    7.1  2001-03-21, Patrick Welche
> NetBSD              m68k     7.0  2000-04-10 (Henry has lost machine)
> NetBSD              Sparc    7.0  2000-04-13, Tom I. Helbekkmo
> NetBSD              VAX      7.1  2001-03-30, Tom I. Helbekkmo
> NetBSD 1.5          x86      7.1  2001-03-23, Giles Lean
> OpenBSD 2.8         Sparc    7.1  2001-03-23, Brandon Palmer
> OpenBSD 2.8         x86      7.1  2001-03-22, Brandon Palmer
> SCO OpenServer 5    x86      7.1  2001-03-13, Billy Allie
> SCO UnixWare 7.1.1  x86      7.1  2001-03-19, Larry Rosenman
> Solaris 2.7-8       Sparc    7.1  2001-03-22, Marc Fournier
> Solaris             x86      7.1  2001-03-27, Mathijs Brands
> SunOS 4.1.4         Sparc    7.1  2001-03-23, Tatsuo Ishii
> WinNT/Cygwin        x86      7.1  2001-03-16, Jason Tishler
>
> And the "unsupported platforms":
>
> DGUX                m88k
> MkLinux DR1         PPC750   7.0  2000-04-13, Tatsuo Ishii
> NextStep            x86
> QNX 4.25            x86      7.0  2000-04-01, Dr. Andreas Kardos
> System V R4         m88k
> System V R4         MIPS
> Ultrix              MIPS     7.1  2001-03-26, Alexander Klimov
> Windows/Win32       x86      7.1  2001-03-26, Magnus Hagander (clients only)

I saw three separate reports of successful builds on Linux 2.4.2 on x86
(including mine), but it isn't listed here.

--
Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: Changing the default value of an inherited column
On Mon, Apr 02, 2001 at 01:27:06PM -0400, Tom Lane wrote:
> Philip: the rule that pg_dump needs to apply w.r.t. defaults for
> inherited fields is that if an inherited field has a default and
> either (a) no parent table supplies a default, or (b) any parent
> table supplies a default different from the child's, then pg_dump
> had better emit the child field explicitly.

The rule above appears to work even if inherited-default conflicts
are not taken as an error, but just result in a derived-table column
with no default.

Nathan Myers
[EMAIL PROTECTED]
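The quoted rule can be written down as a small predicate. This is a sketch only, not pg_dump code: must_dump_child_default is a hypothetical name, defaults are represented as plain strings with NULL meaning "no default", and comparing the strings byte for byte is an assumption about how two defaults would be judged equal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 when the child's column must be emitted explicitly:
 * the child has a default, and either (a) no parent supplies one,
 * or (b) some parent supplies a different one.  Returns 0 otherwise. */
int must_dump_child_default(const char *child_default,
                            const char *const *parent_defaults,
                            size_t nparents)
{
    size_t i;
    int any_parent_default = 0;

    assert(nparents == 0 || parent_defaults != NULL);

    if (child_default == NULL)
        return 0;               /* the rule only covers fields with a default */

    for (i = 0; i < nparents; i++) {
        if (parent_defaults[i] == NULL)
            continue;           /* this parent supplies no default */
        any_parent_default = 1;
        if (strcmp(parent_defaults[i], child_default) != 0)
            return 1;           /* case (b): a parent disagrees */
    }
    return !any_parent_default; /* case (a): nobody supplied one */
}
```

The predicate never asks how a conflict between parents was resolved, which is consistent with the observation above that the rule works whether a conflict is an error or leaves the column without a default.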
Re: [HACKERS] Re: Changing the default value of an inherited column
On Sun, Apr 01, 2001 at 03:15:56PM -0400, Tom Lane wrote:
> Christopher Masto <[EMAIL PROTECTED]> writes:
> > Another thing that seems kind of interesting would be to have:
> > CREATE TABLE base (table_id CHAR(8) NOT NULL [, etc.]);
> > CREATE TABLE foo (table_id CHAR(8) NOT NULL DEFAULT 'foo');
> > CREATE TABLE bar (table_id CHAR(8) NOT NULL DEFAULT 'bar');
> > Then a function on "base" could look at table_id and know which
> > table it's working on. A waste of space, but I can think of
> > uses for it.
>
> This particular need is superseded in 7.1 by the 'tableoid'
> pseudo-column. However you can certainly imagine variants of this
> that tableoid doesn't handle, for example columns where the subtable
> creator can provide a useful-but-not-always-correct default value.

A bit of O-O doctrine... when you find yourself tempted to do something
like the above, it usually means you're trying to do the wrong thing.
You may not have a choice, in some cases, but you should know you are
on the way to architecture meltdown. "She'll blow, Cap'n!"

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: Changing the default value of an inherited column
On Sat, Mar 31, 2001 at 07:44:30PM -0500, Tom Lane wrote:
> [EMAIL PROTECTED] (Nathan Myers) writes:
> >> This seems pretty random. It would be more reasonable if multiple
> >> (default) inheritance weren't allowed unless you explicitly specify a new
> >> default for the new column, but we don't have a syntax for this.
>
> > I agree, but I thought the original issue was that PG _does_ now have
> > syntax for it. Any conflict in default values should result in either
> > a failure, or "no default". Choosing a default randomly, or according
> > to an arbitrary and complicated rule (same thing), is a source of
> > bugs.
>
> Well, we *do* have a syntax for specifying a new default (the same one
> that worked pre-7.0 and does now again). I guess what you are proposing
> is the rule "If conflicting default values are inherited from multiple
> parents that each define the same column name, then an error is reported
> unless the child table redeclares the column and specifies a new default
> to override the inherited ones".
>
> That is:
>
>     create table p1 (f1 int default 1);
>     create table p2 (f1 int default 2);
>     create table c1 (f2 float) inherits(p1, p2);    # XXX
>
> would draw an error about conflicting defaults for c1.f1, but
>
>     create table c1 (f1 int default 3, f2 float) inherits(p1, p2);
>
> would be accepted (and 3 would become the default for c1.f1).
>
> This would take a few more lines of code, but I'm willing to do it if
> people think it's a safer behavior than picking one of the inherited
> default values.

I do.

Allowing the line marked XXX above, but asserting no default for
c1.f1 in that case, would be equally safe. (A warning would be
polite, anyhow.) User code that doesn't rely on the default wouldn't
notice. You only need to choose a default if somebody adding rows to
c1 uses it.

Nathan Myers
[EMAIL PROTECTED]
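The proposed rule can be sketched as a small C function. It is an illustration of the rule as stated in this exchange, not backend code: default_status, resolve_default and the string representation of defaults (NULL meaning "no default") are all hypothetical, and treating a parent with no default as compatible with any other parent is an assumption the thread does not settle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    DEFAULT_NONE,        /* nobody supplies a default */
    DEFAULT_INHERITED,   /* the parents that supply one all agree */
    DEFAULT_OVERRIDDEN,  /* the child redeclares the column's default */
    DEFAULT_CONFLICT     /* parents disagree and the child is silent */
} default_status;

/* Decide the default for a merged column.  On DEFAULT_INHERITED or
 * DEFAULT_OVERRIDDEN, *result points at the chosen default; otherwise
 * it is NULL.  A caller following the stricter proposal reports an
 * error on DEFAULT_CONFLICT; one following the alternative issues a
 * warning and leaves the column without a default. */
default_status resolve_default(const char *child_default,
                               const char *const *parent_defaults,
                               size_t nparents,
                               const char **result)
{
    const char *seen = NULL;
    size_t i;

    assert(result != NULL);
    assert(nparents == 0 || parent_defaults != NULL);
    *result = NULL;

    if (child_default != NULL) {
        *result = child_default;
        return DEFAULT_OVERRIDDEN;
    }
    for (i = 0; i < nparents; i++) {
        const char *d = parent_defaults[i];

        if (d == NULL)
            continue;
        if (seen == NULL)
            seen = d;
        else if (strcmp(seen, d) != 0)
            return DEFAULT_CONFLICT;
    }
    if (seen == NULL)
        return DEFAULT_NONE;
    *result = seen;
    return DEFAULT_INHERITED;
}
```

For the p1/p2 example above, the line marked XXX corresponds to a call with no child default and parent defaults "1" and "2", which yields DEFAULT_CONFLICT; the second form, with the child supplying "3", yields DEFAULT_OVERRIDDEN.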
Re: [HACKERS] Third call for platform testing (linux 2.4.x)
I just built and tested RC1 on Linux 2.4.2, with glibc-2.2.2 and gcc-2.95.2 on a Debian 2.2+ x86 system. ("+" implying some packages from "unstable".) I configured it --with-perl --with-openssl --with-CXX. It built without errors, but with a few warnings. This one seemed (portably) odd: -- In file included from gram.y:43: lex.plpgsql_yy.c: In function `plpgsql_yylex': lex.plpgsql_yy.c:972: warning: label `find_rule' defined but not used -- And this: -- ar crs libpq.a `lorder fe-auth.o fe-connect.o fe-exec.o fe-misc.o fe-print.o fe-lobj.o pqexpbuffer.o dllist.o pqsignal.o | tsort` tsort: -: input contains a loop: tsort: dllist.o -- And this: -- ar crs libecpg.a `lorder execute.o typename.o descriptor.o data.o error.o prepare.o memory.o connect.o misc.o | tsort` tsort: -: input contains a loop: tsort: connect.o tsort: execute.o tsort: data.o -- And this: -- ar crs libplpgsql.a `lorder pl_parse.o pl_handler.o pl_comp.o pl_exec.o pl_funcs.o | tsort` tsort: -: input contains a loop: tsort: pl_comp.o tsort: pl_parse.o -- I ran "make check". It said: -- All 76 tests passed. -- Nathan Myers [EMAIL PROTECTED] On Sat, Mar 31, 2001 at 12:02:35PM +1200, Franck Martin wrote: > I still don't see an entry for Linux 2.4.x > > Cheers. > > Thomas Lockhart wrote: > > > Unreported or problem platforms: > > > > Linux 2.0.x MIPS 7.0 2000-04-13 (Tatsuo has lost machine) > > mklinux PPC750 7.0 2000-04-13, Tatsuo Ishii > > NetBSD m68k7.0 2000-04-10 (Henry has lost machine) > > NetBSD Sparc 7.0 2000-04-13, Tom I. Helbekkmo > > QNX 4.25 x86 7.0 2000-04-01, Dr. Andreas Kardos > > Ultrix MIPS7.1 2001-??-??, Alexander Klimov > > > > mklinux has failed Tatsuo's testing afaicr. Demote to unsupported? > > > > Any NetBSD partisans who can do testing or solicit testing from the > > NetBSD crowd? Same for OpenBSD? > > > > QNX is known to have problems with 7.1. Any hope of fixing for 7.1.1? Is > > there anyone able to work on it? If not, I'll move to the unsupported > > list. 
> > > > And here are the up-to-date platforms; thanks for the reports: > > > > AIX 4.3.3 RS6000 7.1 2001-03-21, Gilles Darold > > BeOS 5.0.3 x86 7.1 2000-12-18, Cyril Velter > > BSDI 4.01 x86 7.1 2001-03-19, Bruce Momjian > > Compaq Tru64 4.0g Alpha 7.1 2001-03-19, Brent Verner > > FreeBSD 4.3 x867.1 2001-03-19, Vince Vielhaber > > HPUX PA-RISC 7.1 2001-03-19, 10.20 Tom Lane, 11.00 Giles Lean > > IRIX 6.5.11 MIPS 7.1 2001-03-22, Robert Bruccoleri > > Linux 2.2.x Alpha 7.1 2001-01-23, Ryan Kirkpatrick > > Linux 2.2.x armv4l 7.1 2001-03-22, Mark Knox > > Linux 2.2.18 PPC750 7.1 2001-03-19, Tom Lane > > Linux 2.2.x S/390 7.1 2000-11-17, Neale Ferguson > > Linux 2.2.15 Sparc 7.1 2001-01-30, Ryan Kirkpatrick > > Linux 2.2.16 x86 7.1 2001-03-19, Thomas Lockhart > > MacOS X Darwin PPC 7.1 2000-12-11, Peter Bierman > > NetBSD 1.5 alpha 7.1 2001-03-22, Giles Lean > > NetBSD 1.5E arm32 7.1 2001-03-21, Patrick Welche > > NetBSD 1.5S x867.1 2001-03-21, Patrick Welche > > OpenBSD 2.8 x867.1 2001-03-22, Brandon Palmer > > SCO OpenServer 5 x86 7.1 2001-03-13, Billy Allie > > SCO UnixWare 7.1.1 x86 7.1 2001-03-19, Larry Rosenman > > Solaris 2.7 Sparc 7.1 2001-03-22, Marc Fournier > > Solaris x867.1 2001-03-27, Mathijs Brands > > SunOS 4.1.4 Sparc 7.1 2001-03-23, Tatsuo Ishii > > Windows/Win32 x86 7.1 2001-03-26, Magnus Hagander (clients only) > > WinNT/Cygwin x86 7.1 2001-03-16, Jason Tishler > > > > ---(end of broadcast)--- > > TIP 2: you can get off all lists at once with the unregister command > > (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) > > > ---(end of broadcast)--- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] Re: Changing the default value of an inherited column
On Fri, Mar 30, 2001 at 11:05:53PM +0200, Peter Eisentraut wrote:
> Tom Lane writes:
>
> > 3. The new column will have a default value if any of the combined
> > column specifications have one. The last-specified default (the one
> > in the explicitly given column list, or the rightmost parent table
> > that gives a default) will be used.
>
> This seems pretty random. It would be more reasonable if multiple
> (default) inheritance weren't allowed unless you explicitly specify a new
> default for the new column, but we don't have a syntax for this.

I agree, but I thought the original issue was that PG _does_ now have
syntax for it. Any conflict in default values should result in either
a failure, or "no default". Choosing a default randomly, or according
to an arbitrary and complicated rule (same thing), is a source of
bugs.

> > 4. All relevant constraints from all the column specifications will
> > be applied. In particular, if any of the specifications includes NOT
> > NULL, the resulting column will be NOT NULL. (But the current
> > implementation does not support inheritance of UNIQUE or PRIMARY KEY
> > constraints, and I do not have time to add that now.)
>
> This is definitely a violation of that Liskov Substitution. If a context
> expects a certain table and gets a more restricted table, it will
> certainly notice.

Not so. The rule is that the base-table code only has to understand
the derived table. The derived table need not be able to represent
all values possible in the base table.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: Changing the default value of an inherited column
On Fri, Mar 30, 2001 at 12:10:59PM -0500, Tom Lane wrote:
> [EMAIL PROTECTED] (Nathan Myers) writes:
> > The O-O principle involved here is Liskov Substitution: if the derived
> > table is used in the context of code that thinks it's looking at the
> > base table, does anything break?
>
> I propose the following behavior:
>
> 1. A table can have only one column of a given name. If the same
> column name occurs in multiple parent tables and/or in the explicitly
> specified column list, these column specifications are combined to
> produce a single column specification. A NOTICE will be emitted to
> warn the user that this has happened. The ordinal position of the
> resulting column is determined by its first appearance.

Treatment of like-named members of multiple base types is not done
consistently in the various O-O languages. It's really a snakepit, and
anything you do automatically will cause terrible problems for somebody.
Nonetheless, for any given circumstances some possible approaches are
clearly better than others.

In C++, as in most O-O languages, the like-named members are kept
distinct. When referred to in the context of a base type, the member
chosen is the "right one". Used in the context of the multiply-derived
type, the compiler reports an ambiguity, and you are obliged to qualify
the name explicitly to identify which among the like-named inherited
members you meant. You can declare which one is "really inherited".
Some other languages presume to choose automatically which one they
think you meant. The real danger is from members inherited from way
back up the trees, which you might not know are there.

Of course PG is different from any O-O language. I don't know if PG
has an equivalent to the "base-class context". I suppose PG has a long
history of merging like-named members, and that the issue is just of
the details of how the merge happens.

> 4. All relevant constraints from all the column specifications will
> be applied. In particular, if any of the specifications includes NOT
> NULL, the resulting column will be NOT NULL. (But the current
> implementation does not support inheritance of UNIQUE or PRIMARY KEY
> constraints, and I do not have time to add that now.)

Sounds like a TODO item...

Do all the triggers of the base tables get applied, to be run one after
another?

--
Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] Re: Changing the default value of an inherited column
On Thu, Mar 29, 2001 at 02:29:38PM +0100, Oliver Elphick wrote:
> Peter Eisentraut wrote:
> >Tom Lane writes:
> >
> >> It seems that in pre-7.0 Postgres, this works:
> >>
> >> create table one(id int default 1, descr text);
> >> create table two(id int default 2, tag text) inherits (one);
> >>
> >> with the net effect that table "two" has just one "id" column with
> >> default value 2.
> >
> >Although the liberty to do anything you want seems appealing at first, I
> >would think that allowing this is not correct from an OO point of view.
>
> I don't agree; this is equivalent to redefinition of a feature (=method) in
> a descendant class, which is perfectly acceptable so long as the feature's
> signature (equivalent to column type) remains unchanged.

The O-O principle involved here is Liskov Substitution: if the derived
table is used in the context of code that thinks it's looking at the
base table, does anything break?

Changing the default value of a column should not break anything,
because the different default value could as well have been entered
in the column manually.

Nathan Myers
[EMAIL PROTECTED]
Re: [HACKERS] MIPS test-and-set
On Mon, Mar 26, 2001 at 07:09:38PM -0500, Tom Lane wrote: > Thomas Lockhart <[EMAIL PROTECTED]> writes: > > That is not already available from the Irix support code? > > What we have for IRIX is > ... > Doesn't look to me like it's likely to work on anything but IRIX ... I have attached linuxthreads/sysdeps/mips/pt-machine.h from glibc-2.2.2 below. (Glibc linuxthreads has alpha, arm, hppa, i386, ia64, m68k, mips, powerpc, s390, SH, and SPARC support, at least in some degree.) Since the actual instruction sequence is probably lifted from the MIPS manual, it's probably much freer than GPL. For the paranoid, the actual instructions, extracted, are just 1: ll %0,%3 bnez %0,2f li %1,1 sc %1,%2 beqz %1,1b 2: Nathan Myers [EMAIL PROTECTED] --- /* Machine-dependent pthreads configuration and inline functions. Copyright (C) 1996, 1997, 1998, 2000 Free Software Foundation, Inc. This file is part of the GNU C Library. Contributed by Ralf Baechle <[EMAIL PROTECTED]>. Based on the Alpha version by Richard Henderson <[EMAIL PROTECTED]>. The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Library General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. The GNU C Library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public License for more details. You should have received a copy of the GNU Library General Public License along with the GNU C Library; see the file COPYING.LIB. If not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */ #include #include #ifndef PT_EI # define PT_EI extern inline #endif /* Memory barrier. */ #define MEMORY_BARRIER() __asm__ ("" : : : "memory") /* Spinlock implementation; required. 
*/ #if (_MIPS_ISA >= _MIPS_ISA_MIPS2) PT_EI long int testandset (int *spinlock) { long int ret, temp; __asm__ __volatile__ ("/* Inline spinlock test & set */\n\t" "1:\n\t" "ll%0,%3\n\t" ".set push\n\t" ".set noreorder\n\t" "bnez %0,2f\n\t" " li %1,1\n\t" ".set pop\n\t" "sc%1,%2\n\t" "beqz %1,1b\n" "2:\n\t" "/* End spinlock test & set */" : "=&r" (ret), "=&r" (temp), "=m" (*spinlock) : "m" (*spinlock) : "memory"); return ret; } #else /* !(_MIPS_ISA >= _MIPS_ISA_MIPS2) */ PT_EI long int testandset (int *spinlock) { return _test_and_set (spinlock, 1); } #endif /* !(_MIPS_ISA >= _MIPS_ISA_MIPS2) */ /* Get some notion of the current stack. Need not be exactly the top of the stack, just something somewhere in the current frame. */ #define CURRENT_STACK_FRAME stack_pointer register char * stack_pointer __asm__ ("$29"); /* Compare-and-swap for semaphores. */ #if (_MIPS_ISA >= _MIPS_ISA_MIPS2) #define HAS_COMPARE_AND_SWAP PT_EI int __compare_and_swap (long int *p, long int oldval, long int newval) { long int ret; __asm__ __volatile__ ("/* Inline compare & swap */\n\t" "1:\n\t" "ll%0,%4\n\t" ".set push\n" ".set noreorder\n\t" "bne %0,%2,2f\n\t" " move %0,%3\n\t" ".set pop\n\t" "sc%0,%1\n\t" "beqz %0,1b\n" "2:\n\t" "/* End compare & swap */" : "=&r" (ret), "=m" (*p) : "r" (oldval), "r" (newval), "m" (*p) : "memory"); return ret; } #endif /* (_MIPS_ISA >= _MIPS_ISA_MIPS2) */ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
[HACKERS] Re: RELEASE STOPPER? nonportable int64 constants in pg_crc.c
On Sat, Mar 24, 2001 at 02:05:05PM -0800, Ian Lance Taylor wrote:
> Tom Lane <[EMAIL PROTECTED]> writes:
> > Ian Lance Taylor <[EMAIL PROTECTED]> writes:
> > > A safe way to construct a long long constant is to do it using an
> > > expression:
> > >     ((((uint64) 0xdeadbeef) << 32) | (uint64) 0xfeedface)
> > > It's awkward, obviously, but it works with any compiler.
> >
> > An interesting example. That will work as intended if and only if the
> > compiler regards 0xfeedface as unsigned ...
>
> True, for additional safety, do this:
>     ((((uint64) (unsigned long) 0xdeadbeef) << 32) |
>      (uint64) (unsigned long) 0xfeedface)

For the paranoid,

    ((((uint64) 0xdead) << 48) | (((uint64) 0xbeef) << 32) | \
     (((uint64) 0xfeed) << 16) | ((uint64) 0xface))

Or, better

    #define FRAG64(bits,shift) (((uint64)(bits)) << (shift))
    #define LITERAL64(a,b,c,d) \
      (FRAG64(a,48) | FRAG64(b,32) | FRAG64(c,16) | FRAG64(d,0))

    LITERAL64(0xdead,0xbeef,0xfeed,0xface)

That might be overkill for just a single literal...

Nathan Myers
ncm
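The fragment approach above can be tried out as a small C sketch. It makes a few assumptions: uint64_t from <stdint.h> stands in for the uint64 typedef used in the thread, the whole expansion of LITERAL64 is wrapped in parentheses so that it behaves as a single operand, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One 16-bit fragment, widened before shifting so no bits are lost. */
#define FRAG64(bits, shift) (((uint64_t)(bits)) << (shift))

/* Four 16-bit fragments, most significant first.  The outer
 * parentheses keep the expansion a single operand. */
#define LITERAL64(a, b, c, d) \
    (FRAG64(a, 48) | FRAG64(b, 32) | FRAG64(c, 16) | FRAG64(d, 0))

uint64_t example_constant(void)
{
    return LITERAL64(0xdead, 0xbeef, 0xfeed, 0xface);
}

/* Without the outer parentheses, "&" would bind to the last fragment
 * only, because "&" has higher precedence than "|". */
uint64_t low16_of_example(void)
{
    return LITERAL64(0xdead, 0xbeef, 0xfeed, 0xface) & 0xffff;
}

/* Cross-check: rebuild the same value 16 bits at a time. */
void check_literal64(void)
{
    static const unsigned parts[4] = { 0xdead, 0xbeef, 0xfeed, 0xface };
    uint64_t v = 0;
    int i;

    for (i = 0; i < 4; i++)
        v = (v << 16) | parts[i];
    assert(v == example_constant());
}
```

Each fragment fits in 16 bits, so it fits in an int on any conforming compiler, and no 64-bit literal syntax is needed anywhere in the source.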
Re: [HACKERS] WAL & SHM principles
Sorry for taking so long to reply... On Wed, Mar 07, 2001 at 01:27:34PM -0800, Mikheev, Vadim wrote: > Nathan wrote: > > It is possible to build a logging system so that you mostly don't care > > when the data blocks get written [after being changed, as long as they get written by an fsync]; > > a particular data block on disk is > > considered garbage until the next checkpoint, so that you > > How to know if a particular data page was modified if there is no > log record for that modification? > (Ie how to know where is garbage? -:)) In such a scheme, any block on disk not referenced up to (and including) the last checkpoint is garbage, and is either blank or reflects a recent logged or soon-to-be-logged change. Everything written (except in the log) after the checkpoint thus has to happen in blocks not otherwise referenced from on-disk -- except in other post-checkpoint blocks. During recovery, the log contents get written to those pages during startup. Blocks that actually got written before the crash are not changed by being overwritten from the log, but that's ok. If they got written before the corresponding log entry, too, nothing references them, so they are considered blank. > > might as well allow the blocks to be written any time, > > even before the log entry. > > And what to do with index tuples pointing to unupdated heap pages > after that? Maybe index pages are cached in shm and copied to mmapped blocks after it is ok for them to be written. What platforms does PG run on that don't have mmap()? Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Uh, this is *not* a 64-bit CRC ...
On Mon, Mar 05, 2001 at 02:00:59PM -0500, Tom Lane wrote: > [EMAIL PROTECTED] (Nathan Myers) writes: > > The CRC-64 code used in the SWISS-PROT genetic database is (now) at: > > ftp://ftp.ebi.ac.uk/pub/software/swissprot/Swissknife/old/SPcrc.tar.gz > > > From the README: > > > The code in this package has been derived from the BTLib package > > obtained from Christian Iseli <[EMAIL PROTECTED]>. > > From his mail: > > > The reference is: W. H. Press, S. A. Teukolsky, W. T. Vetterling, and > > B. P. Flannery, "Numerical recipes in C", 2nd ed., Cambridge University > > Press. Pages 896ff. > > > The generator polynomial is x^64 + x^4 + x^3 + x + 1. > > Nathan (or anyone else with a copy of "Numerical recipes in C", which > I'm embarrassed to admit I don't own), is there any indication in there > that anyone spent any effort on choosing that particular generator > polynomial? As far as I can see, it violates one of the standard > guidelines for choosing a polynomial, namely that it be a multiple of > (x + 1) ... which in modulo-2 land is equivalent to having an even > number of terms, which this ain't got. See Ross Williams' > A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS, available from > ftp://ftp.rocksoft.com/papers/crc_v3.txt among other places, which is > by far the most thorough and readable thing I've ever seen on CRCs. > > I spent some time digging around the net for standard CRC64 polynomials, > and the only thing I could find that looked like it might have been > picked by someone who understood what they were doing is in the DLT > (digital linear tape) standard, ECMA-182 (available from > http://www.ecma.ch/ecma1/STAND/ECMA-182.HTM): > > x^64 + x^62 + x^57 + x^55 + x^54 + x^53 + x^52 + x^47 + x^46 + x^45 + > x^40 + x^39 + x^38 + x^37 + x^35 + x^33 + x^32 + x^31 + x^29 + x^27 + > x^24 + x^23 + x^22 + x^21 + x^19 + x^17 + x^13 + x^12 + x^10 + x^9 + > x^7 + x^4 + x + 1 I'm sorry to have taken so long to reply.
The polynomial chosen for SWISS-PROT turns out to be presented, in Numerical Recipes, just as an example of a primitive polynomial of that degree; no assertion is made about its desirability for error checking. It is (in turn) drawn from E. J. Watson, "Mathematics of Computation", vol. 16, pp368-9. Having (x + 1) as a factor guarantees to catch all errors in which an odd number of bits have been changed. Presumably you are then infinitesimally less likely to catch all errors in which an even number of bits have been changed. I would have posted the ECMA-182 polynomial if I had found it. (That was good searching!) One hopes that the ECMA polynomial was chosen more carefully than entirely at random. High-degree codes are often chosen by Monte Carlo methods, by applying statistical tests to randomly-chosen values, because the search space is so large. I have verified that Tom transcribed the polynomial correctly from the PDF image. The ECMA document doesn't say whether their polynomial is applied "bit-reversed", but the check would be equally strong either way. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
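The "even number of terms" test quoted in this message is easy to check mechanically. Over GF(2), a polynomial is a multiple of (x + 1) exactly when substituting x = 1 gives zero, which happens exactly when the number of terms is even. The sketch below applies that test to the two polynomials discussed in this thread. The hex constants are my own transcription of the term lists (bit i set means x^i is present, with the x^64 term implicit), and the function names are hypothetical.

```c
#include <stdint.h>

/* Transcribed from the term lists in the thread; x^64 is implicit. */
#define SWISSPROT_POLY_LOW UINT64_C(0x000000000000001B) /* x^4 + x^3 + x + 1 */
#define ECMA182_POLY_LOW   UINT64_C(0x42F0E1EBA9EA3693)

/* Count the set bits in a 64-bit word (portable, no compiler builtins). */
static int count_bits(uint64_t v)
{
    int n = 0;

    while (v != 0) {
        v &= v - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * low_terms holds the coefficients of x^63 .. x^0; the x^64 term is added
 * here. Returns 1 if the full polynomial has an even number of terms,
 * i.e. if it is divisible by (x + 1) over GF(2), and 0 otherwise.
 */
int has_x_plus_1_factor(uint64_t low_terms)
{
    return (count_bits(low_terms) + 1) % 2 == 0;
}
```

With these constants the SWISS-PROT polynomial has 5 terms and fails the test, while the ECMA-182 polynomial has 34 terms and passes, which matches the observations made in the exchange above.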
Re: [HACKERS] Internationalized dates (was Internationalized error messages)
On Mon, Mar 12, 2001 at 11:11:46AM +0100, Karel Zak wrote: > On Fri, Mar 09, 2001 at 10:58:02PM +0100, Kaare Rasmussen wrote: > > Now you're talking about i18n, maybe someone could think about input and > > output of dates in local language. > > > > As far as I can tell, PostgreSQL will only use English for dates, eg January, > > February and weekdays, Monday, Tuesday etc. Not the local name. > > May be add special mask to to_char() and use locales for this, but I not > sure. It isn't easy -- arbitrary size of strings, to_char's cache problems > -- more and more difficult is parsing input with locales usage. > The other thing is speed... > > A solution is use number based dates without names :-( ISO has published a standard on date/time formats, ISO 8601. Dates look like "2001-03-22". Times look like "12:47:63". The only unfortunate feature is their standard format for a date/time: "2001-03-22T12:47:63". To me the ISO date format is far better than something involving month names. I'd like to see ISO 8601 as the default date format. -- Nathan Myers [EMAIL PROTECTED]
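As a small illustration of the purely numeric format described above, standard C can already produce an ISO 8601 combined date/time with strftime(), and none of the conversions used here involve month or weekday names. This is only a sketch: format_iso8601 is a hypothetical helper name, and it says nothing about how PostgreSQL's own date output code works.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: writes "YYYY-MM-DDThh:mm:ss" into buf.
 * Returns the number of characters written (19 on success), or 0 if
 * buf is too small, following the strftime() convention.
 */
size_t format_iso8601(char *buf, size_t buflen, const struct tm *when)
{
    /* Only numeric conversions are used, so no names are emitted. */
    return strftime(buf, buflen, "%Y-%m-%dT%H:%M:%S", when);
}
```

A caller fills in a struct tm (for example via gmtime() or localtime()) and passes a buffer of at least 20 bytes.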
Re: [HACKERS] Banner links not working (fwd)
On Mon, Mar 12, 2001 at 08:05:26PM +, Peter Mount wrote: > At 11:41 12/03/01 -0500, Vince Vielhaber wrote: > >On Mon, 12 Mar 2001, Peter Mount wrote: > > > > > Bottom of every page (part of the template) is both my name and email > > > address ;-) > > > >Can we slightly enlarge the font? > > Can do. What size do you think is best? > > I've always used size=1 for that line... Absolute font sizes in HTML are always a mistake. size="-1" would do. -- Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] doxygen & PG
On Sat, Mar 10, 2001 at 06:29:37PM -0500, Tom Lane wrote: > [EMAIL PROTECTED] (Nathan Myers) writes: > > Is this page > > http://members.fortunecity.com/nymia/postgres/dox/backend/html/ > > common knowledge? > > Interesting, but bizarrely incomplete. (Yeah, we have only ~100 > struct types ... sure ...) It does say "version 0.0.1". What was interesting to me is that the interface seems a lot more helpful than the current CVS web gateway. If it were to be completed, and could be kept up to date automatically, something like it could be very useful. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] doxygen & PG
Is this page http://members.fortunecity.com/nymia/postgres/dox/backend/html/ common knowledge? It appears to be an automatically-generated cross-reference documentation web site. My impression is that appropriately-marked comments in the code get extracted to the web pages, too, so it is also a way to automate internal documentation. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Internationalized error messages
On Fri, Mar 09, 2001 at 12:05:22PM -0500, Tom Lane wrote: > > Gettext takes care of this. In the source you'd write > > > elog(ERROR, "2200G", gettext("type mismatch in CASE expression (%s vs %s)"), > > string, string); > > Duh. For some reason I was envisioning the localization substitution as > occurring on the client side, but of course we'd want to do it on the > server side, and before parameters are substituted into the message. > Sorry for the noise. > > I am not sure we can/should use gettext (possible license problems?), > but certainly something like this could be cooked up. I've been assuming that PG's needs are specialized enough that the project wouldn't use gettext directly, but instead something inspired by it. If you look at my last posting on the subject, by the way, you will see that it could work without a catalog underneath; integrating a catalog would just require changes in a header file (and the programs to generate the catalog, of course). That quality seems to me essential to allow the changeover to be phased in gradually, and to allow different underlying catalog implementations to be tried out. Nathan ncm ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Internationalized error messages
On Thu, Mar 08, 2001 at 09:00:09PM -0500, Tom Lane wrote: > [EMAIL PROTECTED] (Nathan Myers) writes: > > Similar approaches have been tried frequently, and even enshrined > > in standards (e.g. POSIX catgets), but have almost always proven too > > cumbersome. The problem is that keeping programs that interpret the > > numeric code in sync with the program they monitor is hard, and trying > > to avoid breaking all those secondary programs hinders development on > > the primary program. Furthermore, assigning code numbers is a nuisance, > > and they add uninformative clutter. > > There's a difficult tradeoff to make here, but I think we do want to > distinguish between the "official error code" --- the thing that has > translations into various languages --- and what the backend is actually > allowed to print out. It seems to me that a fairly large fraction of > the unique messages found in the backend can all be lumped under the > category of "internal error", and that we need to have only one official > error code and one user-level translated message for the lot of them. > But we do want to be able to print out different detail messages for > each of those internal errors. There are other categories that might be > lumped together, but that one alone is sufficiently large to force us > to recognize it. This suggests a distinction between a "primary" or > "user-level" error message, which we catalog and provide translations > for, and a "secondary", "detail", or "wizard-level" error message that > exists only in the backend source code, and only in English, and so > can be made up on the spur of the moment. I suggest using different named functions/macros for different categories of message, rather than arguments to a common function. (I.e. "elog(ERROR, ...)" Considered Harmful.) You might even have more than one call at a site, one for the official message and another for unofficial or unstable informative details. 
> Another thing that I missed in Peter's proposal is how we are going to > cope with messages that include parameters. Surely we do not expect > gettext to start with 'Attribute "foo" not found' and distinguish fixed > from variable parts of that string? The common way to deal with this is to catalog the format string itself, with its embedded % directives. The tricky bit, and what the printf family has had to be extended to handle, is that the order of the formal arguments varies with the target language. The original string is an ordinary printf string, but the translations may have to refer to the substitution arguments by numeric position (as well as type). There is probably Free code to implement that. As much as possible, any compile-time annotations should be extracted into the catalog and filtered out of the source, to be reunited only when you retrieve the catalog entry. > So it's clear that we need to devise a way of breaking an "error > message" into multiple portions, including: > > Primary error message (localizable) > Parameters to insert into error message (user identifiers, etc) > Secondary (wizard) error message (optional) > Source code location > Query text location (optional) > > and perhaps others that I have forgotten about. One of the key things > to think about is whether we can, or should try to, transmit all this > stuff in a backwards-compatible protocol. That would mean we'd have > to dump all the info into a single string, which is doable but would > perhaps look pretty ugly: > > ERROR: Attribute "foo" not found -- basic message for dumb frontends > ERRORCODE: UNREC_IDENT -- key for finding localized message > PARAM1: foo -- something to embed in the localized message > MESSAGE: Attribute or table name not known within context of query > CODELOC: src/backend/parser/parse_clause.c line 345 > QUERYLOC: 22 Whitespace can be used effectively. E.g. only primary messages appear in column 0. 
PG might emit this, which is easily filtered:

Attribute "foo" not found
  severity: cannot proceed
  explain: An attribute or table name was not known within
  explain: the context of the query.
  index: 237 Attribute \"%s\" not found
  location: src/backend/parser/parse_clause.c line 345
  query_position: 22

Here the first line is the localized replacement of what appears in the code, with arguments substituted in. The other stuff comes from the catalog. The call looks like

  elog_query("Attribute \"%s\" not found", foo);
  elog_explain("An attribute or table name was not known within"
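The point made earlier in this message about argument order can be sketched with the numbered-argument conversions ("%1$s", "%2$s") that POSIX adds to the printf family. These are a POSIX extension rather than ISO C, so the sketch assumes a C library that implements them. The reordered string is an invented stand-in for a translation, and format_message is a hypothetical name, not an existing PostgreSQL function.

```c
#include <stddef.h>
#include <stdio.h>

/* The format string as it might appear in the source code. */
static const char *const source_fmt =
    "type mismatch in CASE expression (%s vs %s)";

/*
 * Invented stand-in for a catalog translation whose word order needs the
 * arguments in the opposite order. Numbered conversions are POSIX, not
 * ISO C, and must not be mixed with unnumbered ones in the same string.
 */
static const char *const reordered_fmt =
    "CASE: %2$s does not match %1$s";

/* Hypothetical helper: the same argument list serves either format. */
int format_message(char *buf, size_t buflen, const char *fmt,
                   const char *first, const char *second)
{
    return snprintf(buf, buflen, fmt, first, second);
}
```

The call site never changes: only the format string retrieved from the catalog decides the order in which the arguments appear.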
Re: [HACKERS] Internationalized error messages
On Thu, Mar 08, 2001 at 11:49:50PM +0100, Peter Eisentraut wrote: > I really feel that translated error messages need to happen soon. > Managing translated message catalogs can be done easily with available > APIs. However, translatable messages really require an error code > mechanism (otherwise it's completely impossible for programs to interpret > error messages reliably). I've been thinking about this for much too long > now and today I finally settled to the simplest possible solution. > > Let the actual method of allocating error codes be irrelevant for now, > although the ones in the SQL standard are certainly to be considered for a > start. Essentially, instead of writing > > elog(ERROR, "disaster struck"); > > you'd write > > elog(ERROR, "XYZ01", "disaster struck"); > > Now you'll notice that this approach doesn't make the error message text > functionally dependend on the error code. The alternative would have been > to write > > elog(ERROR, "XYZ01"); > > which makes the code much less clear. Additonally, most of the elog() > calls use printf style variable argument lists. So maybe > > elog(ERROR, "XYZ01", (arg + 1), foo); > > This is not only totally obscure, but also incredibly cumbersome to > maintain and very error prone. One earlier idea was to make the "XYZ01" > thing a macro instead that expands to a string with % arguments, that GCC > can check as it does now. But I don't consider this a lot better, because > the initial coding is still obscured, and additonally the list of those > macros needs to be maintained. (The actual error codes might still be > provided as readable macro names similar to the errno codes, but I'm not > sure if we should share these between server and client.) > > Finally, there might also be legitimate reasons to have different error > message texts for the same error code. For example, "type errors" (don't > know if this is an official code) can occur in a number of places that > might warrant different explanations. 
Indeed, this approach would > preserve "artistic freedom" to some extent while still maintaining some > structure alongside. And it would be rather straightforward to implement, > too. Those who are too bored to assign error codes to new code can simply > pick some "zero" code as default. > > On the protocol front, this could be pretty easy to do. Instead of > "message text" we'd send a string "XYZ01: message text". Worst case, we > pass this unfiltered to the client and provide an extra function that > returns only the first five characters. Alternatively we could strip off > the prefix when returning the message text only. > > At the end, the i18n part would actually be pretty easy, e.g., > > elog(ERROR, "XYZ01", gettext("stuff happened")); Similar approaches have been tried frequently, and even enshrined in standards (e.g. POSIX catgets), but have almost always proven too cumbersome. The problem is that keeping programs that interpret the numeric code in sync with the program they monitor is hard, and trying to avoid breaking all those secondary programs hinders development on the primary program. Furthermore, assigning code numbers is a nuisance, and they add uninformative clutter. It's better to scan the program for elog() arguments, and generate a catalog by using the string itself as the index code. Those maintaining the secondary programs can compare catalogs to see what has been broken by changes and what new messages to expect. elog() itself can (optionally) invent tokens (e.g. catalog indices) to help out those programs. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
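The idea of using the message string itself as the catalog key can be sketched in a few lines. Everything here is hypothetical: CatalogEntry, catalog_lookup and the two entries are made up for illustration, and a real catalog would be generated by scanning the sources rather than written by hand. The fallback to the original string is what lets the scheme work before any catalog exists.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical catalog entry, keyed by the source string itself. */
typedef struct {
    const char *source;      /* format string exactly as written in the code */
    const char *translated;  /* replacement format string for one locale */
} CatalogEntry;

/* Made-up entries; a real table would be generated from the sources. */
static const CatalogEntry catalog[] = {
    { "Attribute \"%s\" not found", "Attribut \"%s\" nicht gefunden" },
    { "disaster struck", "Katastrophe eingetreten" },
};

/*
 * Returns the translated format string, or the source string unchanged
 * when the catalog has no entry for it.
 */
const char *catalog_lookup(const char *source)
{
    size_t i;

    for (i = 0; i < sizeof catalog / sizeof catalog[0]; i++) {
        if (strcmp(catalog[i].source, source) == 0)
            return catalog[i].translated;
    }
    return source;
}
```

A linear scan is used only to keep the sketch short; a generated catalog could just as well be sorted or hashed.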
Re: [HACKERS] Use SIGQUIT instead of SIGUSR1?
On Thu, Mar 08, 2001 at 04:06:16PM -0500, Tom Lane wrote: > To implement the idea of performing a checkpoint after every so many > XLOG megabytes (as well as after every so many seconds), I need to pick > an additional signal number for the postmaster to accept. Seems like > the most appropriate choice for this is SIGUSR1, which isn't currently > being used at the postmaster level. > > However, if I just do that, then SIGUSR1 and SIGQUIT will have > completely different meanings for the postmaster and for the backends, > in fact SIGQUIT to the postmaster means send SIGUSR1 to the backends. > This seems hopelessly confusing. > > I think it'd be a good idea to change the code so that SIGQUIT is the > per-backend quickdie() signal, not SIGUSR1, to bring the postmaster and > backend signals back into some semblance of agreement. > > For the moment we could leave the backends also accepting SIGUSR1 as > quickdie, just in case someone out there is in the habit of sending > that signal manually to individual backends. Eventually backend SIGUSR1 > might be reassigned to mean something else. (I suspect Bruce is > coveting it already ;-).) The number and variety of signals used in PG is already terrifying. Attaching a specific meaning to SIGQUIT may be dangerous if the OS and its daemons also send SIGQUIT to mean something subtly different. I'd rather see a reduction in the use of signals, and a movement toward more modern, better behaved interprocess communication mechanisms. Still, "if it were done when 'tis done, then 'twere well It were done" cleanly. -- Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] Proposed WAL changes
On Wed, Mar 07, 2001 at 12:03:41PM -0800, Mikheev, Vadim wrote: > Ian wrote: > > > I feel that the fact that > > > > > > WAL can't help in the event of disk errors > > > > > > is often overlooked. > > > > This is true in general. But, nevertheless, WAL can be written to > > protect against predictable disk errors, when possible. Failing to > > write a couple of disk blocks when the system crashes or, more likely, when power drops; a system crash shouldn't keep the disk from draining its buffers ... > > is a reasonably predictable disk error. WAL should ideally be > > written to work correctly in that situation. > > But what can be done if fsync returns before pages flushed? Just what Tom has done: preserve a little more history. If it's not too expensive, then it doesn't hurt you when running on sound hardware, but it offers a good chance of preventing embarrassments for (the overwhelming fraction of) users on garbage hardware. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] WAL & SHM principles
On Wed, Mar 07, 2001 at 11:21:37AM -0500, Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > The only problem is that we would no longer have control over which > > pages made it to disk. The OS would perhaps write pages as we modified > > them. Not sure how important that is. > > Unfortunately, this alone is a *fatal* objection. See nearby > discussions about WAL behavior: we must be able to control the relative > timing of WAL write/flush and data page writes. Not so fast! It is possible to build a logging system so that you mostly don't care when the data blocks get written; a particular data block on disk is considered garbage until the next checkpoint, so that you might as well allow the blocks to be written any time, even before the log entry. Letting the OS manage sharing of disk block images via mmap should be an enormous win vs. a fixed shm and manual scheduling by PG. If that requires changes in the logging protocol, it's worth it. (What supported platforms don't have mmap?) Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Proposed WAL changes
On Wed, Mar 07, 2001 at 11:09:25AM -0500, Tom Lane wrote: > "Vadim Mikheev" <[EMAIL PROTECTED]> writes: > >> * Store two past checkpoint locations, not just one, in pg_control. > >> On startup, we fall back to the older checkpoint if the newer one > >> is unreadable. Also, a physical copy of the newest checkpoint record > > > And what to do if older one is unreadable too? > > (Isn't it like using 2 x CRC32 instead of CRC64 ? -:)) > > Then you lose --- but two checkpoints gives you twice the chance of > recovery (probably more, actually, since it's much more likely that > the previous checkpoint will have reached disk safely). Actually far more: if the checkpoints are minutes apart, even the worst disk drive will certainly have flushed any blocks written for the earlier checkpoint. -- Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
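The fallback rule quoted above (use the newest checkpoint, fall back to the previous one if the newest is unreadable) amounts to the selection logic below. This is a sketch only: CheckpointRef and choose_checkpoint are hypothetical names, the readable flag stands in for whatever integrity check the real code performs, and nothing here reflects the actual pg_control layout.

```c
#include <stddef.h>

/* Hypothetical record of one remembered checkpoint. */
typedef struct {
    unsigned long location;  /* where the checkpoint record lives in the log */
    int readable;            /* nonzero if the record passed its integrity check */
} CheckpointRef;

/*
 * Prefer the newest checkpoint; fall back to the previous one if the
 * newest cannot be read. Returns NULL when neither is usable, which is
 * the "then you lose" case from the quoted message.
 */
const CheckpointRef *choose_checkpoint(const CheckpointRef *newest,
                                       const CheckpointRef *previous)
{
    if (newest != NULL && newest->readable)
        return newest;
    if (previous != NULL && previous->readable)
        return previous;
    return NULL;
}
```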
Re: [HACKERS] Red Hat bashing
On Tue, Mar 06, 2001 at 04:20:13PM -0500, Lamar Owen wrote: > Nathan Myers wrote: > > That is why there is no problem with version skew in the syscall > > argument structures on a correctly-configured Linux system. (On a > > Red Hat system it is very easy to get them out of sync, but RH fans > > are used to problems.) > > Is RedHat bashing really necessary here? I recognize that my last seven words above contributed nothing. In the future I will only post strictly factual statements about Red Hat and similarly charged topics, and keep the opinions to myself. I value the collegiality of this list too much to risk it further. I offer my apologies for violating it. By the way... do they call Red Hat "RedHat" at Red Hat? Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] How to shoot yourself in the foot: kill -9 postmaster
On Tue, Mar 06, 2001 at 08:19:12PM +0100, Peter Eisentraut wrote: > Alfred Perlstein writes: > > > Seriously, there's some dispute on the type that 'shm_nattch' is, > > under Solaris it's "shmatt_t" (unsigned long afaik), under FreeBSD > > it's 'short' (i should fix this. :)). > > What I don't like is that my /usr/include/sys/shm.h (through other > headers) has: > > typedef unsigned long int shmatt_t; > > /* Data structure describing a set of semaphores. */ > struct shmid_ds > { > struct ipc_perm shm_perm; /* operation permission struct */ > size_t shm_segsz; /* size of segment in bytes */ > __time_t shm_atime; /* time of last shmat() */ > unsigned long int __unused1; > __time_t shm_dtime; /* time of last shmdt() */ > unsigned long int __unused2; > __time_t shm_ctime; /* time of last change by shmctl() */ > unsigned long int __unused3; > __pid_t shm_cpid; /* pid of creator */ > __pid_t shm_lpid; /* pid of last shmop */ > shmatt_t shm_nattch;/* number of current attaches */ > unsigned long int __unused4; > unsigned long int __unused5; > }; > > whereas /usr/src/linux/include/shm.h has: > > struct shmid_ds { > struct ipc_perm shm_perm; /* operation perms */ > int shm_segsz; /* size of segment (bytes) */ > __kernel_time_t shm_atime; /* last attach time */ > __kernel_time_t shm_dtime; /* last detach time */ > __kernel_time_t shm_ctime; /* last change time */ > __kernel_ipc_pid_t shm_cpid; /* pid of creator */ > __kernel_ipc_pid_t shm_lpid; /* pid of last operator */ > unsigned short shm_nattch; /* no. of current attaches */ > unsigned short shm_unused; /* compatibility */ > void*shm_unused2; /* ditto - used by DIPC */ > void*shm_unused3; /* unused */ > }; > > > Not only note the shm_nattch type, but also shm_segsz, and the "unused" > fields in between. I don't know a thing about the Linux kernel sources, > but this doesn't seem right. On Linux, /usr/src/linux/include is meaningless for anything in userland; it's meant only for building the kernel and kernel modules. 
That Red Hat tends to expose it to user-level builds is a long-standing bug in Red Hat's distribution, in violation of the Filesystem Hierarchy Standard as well as explicit instructions from Linus & crew and from the maintainer of the C library. User-level programs see what's in /usr/include, which only has to match what the C library wants. It's the C library's job to do any mapping needed, and it does. The C library is maintained very, very carefully to keep binary compatibility with all old versions. (One sometimes encounters commercial programs that rely on a bug or undocumented/unsupported feature that disappears in a later library version.) That is why there is no problem with version skew in the syscall argument structures on a correctly-configured Linux system. (On a Red Hat system it is very easy to get them out of sync, but RH fans are used to problems.) Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] How to shoot yourself in the foot: kill -9 postmaster
On Mon, Mar 05, 2001 at 08:55:41PM -0500, Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > killproc should send a kill -15 to the process, wait a few seconds for > > it to exit. If it does not, try kill -1, and if that doesn't kill it, > > then kill -9. > > Tell it to the Linux people ... this is their boot-script code we're > talking about. Not to be a zealot, but this isn't _Linux_ boot-script code, it's _Red Hat_ boot-script code. Red Hat would like for us all to confuse the two, but they jes' ain't the same. (As a rule of thumb, where it works right, credit Linux; where it doesn't, blame Red Hat. :-) Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] WAL & RC1 status
On Fri, Mar 02, 2001 at 10:54:04AM -0500, Bruce Momjian wrote: > > Bruce Momjian <[EMAIL PROTECTED]> writes: > > > Is there a version number in the WAL file? > > > > catversion.h will do fine, no? > > > > > Can we put conditional code in there to create > > > new log file records with an updated format? > > > > The WAL stuff is *far* too complex already. I've spent a week studying > > it and I only partially understand it. I will not consent to trying to > > support multiple log file formats concurrently. > > Well, I was thinking a few things. Right now, if we update the > catversion.h, we will require a dump/reload. If we can update just the > WAL version stamp, that will allow us to fix WAL format problems without > requiring people to dump/reload. I can imagine this would be valuable > if we find we need to make changes in 7.1.1, where we can not require > dump/reload. It Seems to Me that after an orderly shutdown, the WAL files should be, effectively, slag -- they should contain no deltas from the current table contents. In practice that means the only part of the format that *should* matter is whatever it takes to discover that they really are slag. That *should* mean that, at worst, a change to the WAL file format should only require doing an orderly shutdown, and then (perhaps) running a simple program to generate a new-format empty WAL. It ought not to require an initdb. Of course the details of the current implementation may interfere with that ideal, but it seems a worthy goal for the next beta, if it's not possible already. Given the opportunity to change the current WAL format, it ought to be possible to avoid even needing to run a program to generate an empty WAL. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Uh, this is *not* a 64-bit CRC ...
On Wed, Feb 28, 2001 at 09:17:19PM -0500, Bruce Momjian wrote: > > On Wed, Feb 28, 2001 at 04:53:09PM -0500, Tom Lane wrote: > > > I just took a close look at the COMP_CRC64 macro in xlog.c. > > > > > > This isn't a 64-bit CRC. It's two independent 32-bit CRCs, one done > > > on just the odd-numbered bytes and one on just the even-numbered bytes > > > of the datastream. That's hardly any stronger than a single 32-bit CRC; > > > it's certainly not what I thought we had agreed to implement. > > > > > > We can't change this algorithm without forcing an initdb, which would be > > > a rather unpleasant thing to do at this late stage of the release cycle. > > > But I'm not happy with it. Comments? > > > > This might be a good time to update: > > > > The CRC-64 code used in the SWISS-PROT genetic database is (now) at: > > > > ftp://ftp.ebi.ac.uk/pub/software/swissprot/Swissknife/old/SPcrc.tar.gz > > > > From the README: > > > > The code in this package has been derived from the BTLib package > > obtained from Christian Iseli <[EMAIL PROTECTED]>. > > From his mail: > > > > The reference is: W. H. Press, S. A. Teukolsky, W. T. Vetterling, and > > B. P. Flannery, "Numerical recipes in C", 2nd ed., Cambridge University > > Press. Pages 896ff. > > > > The generator polynomial is x^64 + x^4 + x^3 + x + 1. > > > > I would suggest that if you don't change the algorithm, at least change > > the name in the sources. Were you to #ifdef in a real crc-64, and make > > a compile-time option to select the old one, you could allow users who > > wish to avoid the initdb a way to continue with the existing pair of > > CRC-32s. > > Added to TODO: > > * Correct CRC WAL code to be normal CRC32 algorithm Um, how about * Correct CRC WAL code to be a real CRC64 algorithm instead? Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] Uh, this is *not* a 64-bit CRC ...
On Wed, Feb 28, 2001 at 04:53:09PM -0500, Tom Lane wrote: > I just took a close look at the COMP_CRC64 macro in xlog.c. > > This isn't a 64-bit CRC. It's two independent 32-bit CRCs, one done > on just the odd-numbered bytes and one on just the even-numbered bytes > of the datastream. That's hardly any stronger than a single 32-bit CRC; > it's certainly not what I thought we had agreed to implement. > > We can't change this algorithm without forcing an initdb, which would be > a rather unpleasant thing to do at this late stage of the release cycle. > But I'm not happy with it. Comments? This might be a good time to update: The CRC-64 code used in the SWISS-PROT genetic database is (now) at: ftp://ftp.ebi.ac.uk/pub/software/swissprot/Swissknife/old/SPcrc.tar.gz From the README: The code in this package has been derived from the BTLib package obtained from Christian Iseli <[EMAIL PROTECTED]>. From his mail: The reference is: W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, "Numerical recipes in C", 2nd ed., Cambridge University Press. Pages 896ff. The generator polynomial is x^64 + x^4 + x^3 + x + 1. I would suggest that if you don't change the algorithm, at least change the name in the sources. Were you to #ifdef in a real crc-64, and make a compile-time option to select the old one, you could allow users who wish to avoid the initdb a way to continue with the existing pair of CRC-32s. Nathan Myers [EMAIL PROTECTED]
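For contrast with the pair of interleaved CRC-32s described above, here is what a single 64-bit CRC looks like in its simplest bit-at-a-time form. This is a sketch under stated assumptions, not the code from xlog.c or from the SWISS-PROT package: it uses the ECMA-182 polynomial mentioned elsewhere in this thread (constant transcribed by me from the term list), processes bits most-significant first, starts from zero and applies no final XOR. crc64_update is a hypothetical name, and a production version would normally be table-driven.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: ECMA-182 generator, low 64 coefficients; x^64 is implicit. */
#define CRC64_POLY_ECMA182 UINT64_C(0x42F0E1EBA9EA3693)

/*
 * Bitwise CRC-64, most-significant bit first. Pass 0 as crc for the first
 * chunk and the previous return value for later chunks, so a buffer can be
 * processed in pieces.
 */
uint64_t crc64_update(uint64_t crc, const unsigned char *data, size_t len)
{
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        /* Feed the next byte into the top of the 64-bit register. */
        crc ^= (uint64_t) data[i] << 56;
        for (bit = 0; bit < 8; bit++) {
            if (crc & UINT64_C(0x8000000000000000))
                crc = (crc << 1) ^ CRC64_POLY_ECMA182;
            else
                crc <<= 1;
        }
    }
    return crc;
}
```

Unlike the odd/even byte scheme, every input bit here influences one 64-bit remainder. With these parameters, appending the CRC (most significant byte first) to the data and running the function over the whole thing yields zero, which gives a convenient self-check.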
Re: [HACKERS] Re: [PATCHES] A patch for xlog.c
On Sun, Feb 25, 2001 at 11:28:46PM -0500, Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > It allows no backing store on disk. I.e. it allows you to map memory without an associated inode; the memory may still be swapped. Of course, there is no problem with mapping an inode too, so that unrelated processes can join in. Solaris has a flag to pin the shared pages in RAM so they can't be swapped out. > > It is the BSD solution to SysV > > shared memory. Here are all the BSDi flags: > > > MAP_ANON Map anonymous memory not associated with any specific > > file. The file descriptor used for creating MAP_ANON > > must be -1. The offset parameter is ignored. > > Hmm. Now that I read down to the "nonstandard extensions" part of the > HPUX man page for mmap(), I find > > If MAP_ANONYMOUS is set in flags: > > o A new memory region is created and initialized to all zeros. > This memory region can be shared only with descendants of > the current process. This is supported on Linux and BSD, but not on Solaris 7. It's not necessary; you can just map /dev/zero on SysV systems that don't have MAP_ANON. > While I've said before that I don't think it's really necessary for > processes that aren't children of the postmaster to access the shared > memory, I'm not sure that I want to go over to a mechanism that makes it > *impossible* for that to be done. Especially not if the only motivation > is to avoid having to configure the kernel's shared memory settings. There are enormous advantages to avoiding the need to configure kernel settings. It makes PG a better citizen. PG is much easier to drop in and use if you don't need attention from the IT department. But I don't know of any reason to avoid mapping an actual inode, so using mmap doesn't necessarily mean giving up sharing among unrelated processes. > Besides, what makes you think there's not a limit on the size of shmem > allocatable via mmap()? I've never seen any mmap limit documented.
Since mmap() is how everybody implements shared libraries, such a limit would be equivalent to a limit on how much/many shared libraries are used. mmap() with MAP_ANONYMOUS (or its SysV /dev/zero equivalent) is a common, modern way to get raw storage for malloc(), so such a limit would be a limit on malloc() too. The mmap architecture comes to us from the Mach microkernel memory manager, backported into BSD and then copied widely. Since it was the fundamental mechanism for all memory operations in Mach, arbitrary limits would make no sense. That it worked so well is the reason it was copied everywhere else, so adding arbitrary limits while copying it would be silly. I don't think we'll see any systems like that. Nathan Myers [EMAIL PROTECTED]
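
The two kinds of mapping discussed in this message can be sketched
roughly as follows. This is only an illustration in Python on a Unix
system, not PostgreSQL code: the function names are hypothetical, and
in the second function two mappings inside one process stand in for
two unrelated processes that open the same file.

```python
import mmap
import os
import tempfile


def share_with_child(message: bytes) -> bytes:
    """Anonymous shared mapping: reachable only by this process and its children."""
    # fileno=-1 requests memory with no backing file. On Unix the default
    # flags are MAP_SHARED, so a forked child writes to the same pages.
    region = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        # Child: write into the inherited mapping, then exit immediately.
        region[:len(message)] = message
        os._exit(0)
    os.waitpid(pid, 0)  # wait until the child has written and exited
    data = region[:len(message)]
    region.close()
    return data


def share_through_file(path: str, message: bytes) -> bytes:
    """File-backed shared mapping: any process able to open `path` can join."""
    size = mmap.PAGESIZE
    with open(path, "r+b") as writer_file, open(path, "r+b") as reader_file:
        # The file must be at least as long as the region being mapped.
        writer_file.truncate(size)
        writer = mmap.mmap(writer_file.fileno(), size)
        reader = mmap.mmap(reader_file.fileno(), size)
        writer[:len(message)] = message   # write through one mapping
        data = reader[:len(message)]      # read it back through the other
        writer.close()
        reader.close()
    return data


if __name__ == "__main__":
    print(share_with_child(b"written by the child"))
    with tempfile.NamedTemporaryFile() as backing:
        print(share_through_file(backing.name, b"written via the file"))
```

The anonymous mapping needs no kernel configuration but can only be
inherited, which is the restriction Tom's HPUX excerpt describes. The
file-backed mapping keeps the door open for unrelated processes at the
cost of having a real inode behind it.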
Re: [HACKERS] CommitDelay performance improvement
On Sun, Feb 25, 2001 at 12:41:28AM -0500, Tom Lane wrote:
> Attached are graphs from more thorough runs of pgbench with a commit
> delay that occurs only when at least N other backends are running active
> transactions. ...
> It's not entirely clear what set of parameters is best, but it is
> absolutely clear that a flat zero-commit-delay policy is NOT best.
>
> The test conditions are postmaster options -N 100 -B 1024, pgbench scale
> factor 10, pgbench -t (transactions per client) 100. (Hence the results
> for a single client rely on only 100 transactions, and are pretty noisy.
> The noise level should decrease as the number of clients increases.)

It's hard to interpret these results. In particular, "delay 10k, sibs
20" (10k,20), or cyan-triangle, is almost the same as "delay 50k, sibs
1" (50k,1), or green X. Those are pretty different parameters to get
such similar results.

The only really bad performers were (0), (10k,1), (100k,20). The best
were (30k,1) and (30k,10), although (30k,5) also did well except at 40.
Why would 30k be a magic delay, regardless of siblings? What happened
at 40?

At low loads, it seems (100k,1) (brown +) did best by far, which seems
very odd. Even more odd, it did pretty well at very high loads but had
problems at intermediate loads.

Nathan Myers
[EMAIL PROTECTED]
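
For readers decoding the (delay, sibs) labels, the policy Tom
describes (delay only when at least N other backends are running
active transactions) might be sketched like this. It is a guess at the
shape of the rule, not the patch itself: the function and parameter
names are hypothetical, and treating the delay as microseconds is an
assumption the message does not state.

```python
def commit_delay_to_apply(delay_us: int, min_siblings: int,
                          active_siblings: int) -> int:
    """Return the delay (assumed microseconds) to wait before flushing a commit.

    delay_us        -- configured delay, e.g. 10_000 for the "10k" runs
    min_siblings    -- the "sibs" value: other active backends required
    active_siblings -- other backends currently inside a transaction
    """
    if delay_us > 0 and active_siblings >= min_siblings:
        # Enough concurrent activity that waiting may let several
        # commits share a single flush.
        return delay_us
    # Too few siblings (or delay disabled): commit immediately.
    return 0


# The (10k,20) configuration with only 5 other active backends: no delay.
print(commit_delay_to_apply(10_000, 20, 5))
# The (50k,1) configuration with 5 other active backends: full delay.
print(commit_delay_to_apply(50_000, 1, 5))
```

Seen this way, (10k,20) and (50k,1) differ both in how often the delay
fires and in how long it lasts, which is why their similar curves are
surprising.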