Re: [HACKERS] Auto Partitioning
> David Fetter wrote:
> > On Fri, Apr 06, 2007 at 09:22:55AM -0700, Joshua D. Drake wrote:
> > > > The people that use it are the people stuck by dogmatic rules about
> > > > "every table must have a primary key" or "every logical constraint
> > > > must be protected by a database constraint". Ie, database shops run
> > > > by the CYA principle.
> > >
> > > Or ones that actually believe that every table where possible should
> > > have a primary key.
> > >
> > > There are very, very few instances in good design where a table does
> > > not have a primary key.
> > >
> > > It has nothing to do with CYA.
> >
> > That depends on what you mean by CYA. If you mean, "taking a
> > precaution just so you can show it's not your fault when the manure
> > hits the fan," I agree. If you mean, "taking a precaution that will
> > actually prevent a problem from occurring in the first place," it
> > definitely does.
>
> Heh, fair enough. When I think of CYA, I think of the former.
>
> Joshua D. Drake

...I was thinking the point was more about "primary key" as syntax, as opposed to a table that has an attribute (or attributes) acknowledged by DML coders as the appropriate way to use the stored data. That is, I may very well _not_ want the overhead of an index of any kind, forced uniqueness, etc., but might still think of a given attribute as the primary key. Use of constraints in lieu of "primary key" comes to mind... 'Course, maybe I missed the point! -smile-

'Nother thought: CYA _can_ have odious performance costs if over-implemented. It's a matter of using actual use cases - or observed behavior - to tailor the CYA solution to fit the need without undue overhead.

Rgds,
Richard

--
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend
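The distinction Richard draws - a column the application treats as the key, without paying for an index or declared uniqueness - can be sketched in SQL; the table and column names here are hypothetical, chosen only for illustration:

```sql
-- The "dogmatic" declaration: PRIMARY KEY implies NOT NULL
-- plus a unique index that must be maintained on every write.
CREATE TABLE measurements_strict (
    probe_id  integer PRIMARY KEY,
    reading   numeric
);

-- The alternative described above: DML coders agree that probe_id
-- identifies a row, but no index or uniqueness check is paid for.
-- A CHECK constraint can still guard basic validity cheaply.
CREATE TABLE measurements_loose (
    probe_id  integer NOT NULL CHECK (probe_id > 0),
    reading   numeric
);
```

Whether the savings outweigh the lost enforcement is exactly the trade-off being argued in the thread.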
Re: [HACKERS] Proposal: Commit timestamp
On Fri, 9 Feb 2007, Jan Wieck wrote:
>
> No matter how many different models you have in parallel, one single
> transaction will be either a master, a slave or an isolated local thing.
> The proposed changes allow to tell the session which of these three
> roles it is playing and the triggers and rules can be configured to fire
> during master/local role, slave role, always or never. That
> functionality will work for master-slave as well as multi-master.
>
> Although my current plan isn't creating such a blended system, the
> proposed trigger and rule changes are designed to support exactly that
> in a 100% backward compatible way.
>
> Jan

Fantastic! ...At some point you'll be thinking of the management end - turning it on or off, etc. That might be where these other points come more into play.

Richard
Re: [HACKERS] Proposal: Commit timestamp
On Fri, 9 Feb 2007, Andrew Dunstan wrote:
> Richard Troy wrote:
> > In more specific terms, and I'm just brainstorming in public here, perhaps
> > we can use the power of Schemas within a database to manage such
> > divisions; commands which pertain to replication can/would include a
> > schema specifier and elements within the schema can be replicated one way
> > or another, at the whim of the DBA / Architect. For backwards
> > compatibility, if a schema isn't specified, it indicates that command
> > pertains to the entire database.
>
> I understand that you're just thinking aloud, but overloading namespaces
> in this way strikes me as awful. Applications and extensions, which are
> the things that have need of namespaces, should not have to care about
> replication. If we have to design them for replication we'll be on a
> fast track to nowhere IMNSHO.

Well, Andrew, replication _is_ an application. Or, you could think of replication as an extension to an application. I was under the impression that _users_ decide to put tables in schema spaces based upon _user_ need, and that Postgres developers' use of them for other purposes was encroaching on user choices, not the other way around. Either way, claiming "need" like this strikes me as stuck-in-a-rut or dogmatic thinking. Besides, don't we have schema nesting to help resolve any such "care"? And what do you mean by "design them for replication"?

While I'm in no way stuck on blending replication strategies via schemas, it does strike me as an appropriate concept, and I'd prefer to have it evaluated based on technical merit - possibly citing workarounds or solutions to technical issues, which is what I gather has been the tradition of this group: use case first, technical merit second... Other alternatives, ISTM, will have virtually the same look/feel as a schema from an external perspective, and the more I think of it the more I think using schemas is a sound, clean approach.

That it offends someone's sense of aesthetics STM a poor rationale for not choosing it. Another question might be: what's lacking in the implementation of schemas that makes this a poor choice, and what could be done about it without much effort?

Regards,
Richard
Re: [HACKERS] Proposal: Commit timestamp
On Fri, 9 Feb 2007, Jan Wieck wrote:
>
> [ I wrote ]
> > It'd be great if Jan considers the blending of replication;
>
> Please elaborate. I would really like to get all you can contribute.

Thanks, Jan. Prefaced that I really haven't read everything you've written on this (or what other people are doing, either), and that I've got a terrible flu right now (fever, etc.), I'll give it a go - hopefully it's actually helpful. To wit:

In general terms, "blending of replication [techniques]" means to me that one can have a single database instance serve as a master and as a slave (to use only one set of terminology), and as a multi-master, too, all simultaneously, letting the DBA / Architect choose which portions serve which roles (purposes). All replication features would respect the boundaries of such choices automatically, as it's all blended.

In more specific terms, and I'm just brainstorming in public here, perhaps we can use the power of Schemas within a database to manage such divisions; commands which pertain to replication can/would include a schema specifier, and elements within the schema can be replicated one way or another, at the whim of the DBA / Architect. For backwards compatibility, if a schema isn't specified, it indicates that the command pertains to the entire database.

At the very least, a schema division strategy for replication leverages an existing DB-component binding/dividing mechanism that most everyone is familiar with. While there are/may be database-wide, nay, installation-wide constructs as in your Commit Timestamp proposal, I don't see that there's any conflict - at least, from what I understand of existing systems and proposals to date.

HTH,
Richard
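The brainstormed schema-scoped replication commands do not exist in PostgreSQL; purely as a sketch of the idea, with entirely hypothetical syntax (the REPLICATE commands are invented for illustration), it might look like:

```sql
-- Real commands: divide the database along existing schema boundaries.
CREATE SCHEMA factory_local;   -- autonomous at this site
CREATE SCHEMA hq_reporting;    -- pushed up to headquarters

-- Hypothetical commands, illustrating the proposal: replication scoped
-- per schema, defaulting to the whole database when no schema is named.
-- REPLICATE SCHEMA hq_reporting AS SLAVE OF hq_server;
-- REPLICATE SCHEMA factory_local AS MASTER;
-- REPLICATE AS MASTER;   -- no schema given: applies database-wide
```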
Re: [HACKERS] Proposal: Commit timestamp
On Thu, 8 Feb 2007, Joshua D. Drake wrote:
>
> Well how deep are we talking here? My understanding of what Jan wants to
> do is simple.
>
> Be able to declare which triggers are fired depending on the state of
> the cluster.
>
> In Jan's terms, the Origin or Subscriber. In Replicator terms the Master
> or Slave.
>
> This is useful because I may have a trigger on the Master and the same
> trigger on the Slave. You do not want the trigger to fire on the Slave
> because we are doing data replication. In short, we replicate the
> result, not the action.
>
> However, you may want triggers that are on the Slave to fire separately.
> A reporting server that generates materialized views is a good example.
> Don't tie up the Master with what a Slave can do.

It'd be great if Jan considers the blending of replication; any given DB instance shouldn't be only a master/originator or only a slave/subscriber. A solution that lets you blend replication strategies in a single db is, from my point of view, very important.

> I have no clue what got you into what you are doing here.

Jan, some sleep now and then might be helpful to your public disposition. -smile-

peace,
Richard
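For reference, the behavior Joshua describes is essentially what later shipped in PostgreSQL 8.3 as session roles and per-role trigger firing; a sketch of that interface, with hypothetical table and trigger names:

```sql
-- Fire this trigger only when the session is a replication target
-- (Joshua's "Slave"), e.g. to refresh a reporting table:
ALTER TABLE orders ENABLE REPLICA TRIGGER refresh_report;

-- Fire this trigger regardless of the session's role:
ALTER TABLE orders ENABLE ALWAYS TRIGGER audit_orders;

-- A replication daemon marks its session so that ordinary triggers
-- are skipped and only REPLICA/ALWAYS triggers fire:
SET session_replication_role = replica;
```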
Re: [HACKERS] Proposal: Commit timestamp
> Jan Wieck wrote:
> > Are we still discussing if the Postgres backend may provide support for
> > a commit timestamp, that follows the rules for Lamport timestamps in a
> > multi-node cluster?

...I thought you said in this thread that you haven't and weren't going to work on any kind of logical proof of its correctness, saw no value in prototyping your way to a clear (convincing) argument, and were withdrawing the proposal due to all the issues others raised which were, in light of this, unanswerable beyond conjecture. I thought that the thread was continuing because other people saw value in the kernel of the idea, would support it if it could be shown to be correct/useful, were disappointed you'd leave it at that, and wanted to continue to see if something positive might come of the dialogue. So, the thread weaved around a bit. I think that if you want to nail this down, people here are willing to be convinced, but that hasn't happened yet.

On Wed, 7 Feb 2007, Markus Schiltknecht wrote:
> I'm only trying to get a discussion going, because a) I'm interested in
> how you plan to solve these problems and b) in the past, most people
> were complaining that all the different replication efforts didn't try
> to work together. I'm slowly trying to open up and discuss what I'm
> doing with Postgres-R on the lists.
>
> Just yesterday at the SFPUG meeting, I've experienced how confusing it
> is for the users to have such a broad variety of (existing and upcoming)
> replication solutions. And I'm all for working together and probably
> even for merging different replication solutions.

In support of that idea, I offer this: when Randy Eash wrote the world's first replication system for Ingres circa 1990, his work included ideas and features that are right now, in the Postgres world, fragmented among several existing replication / replication-related products, along with some things that are only now in discussion in this group.

As discussed at the SFPUG meeting last night, real-world use cases are seldom if ever completely satisfied with a one-size-fits-all replication strategy. For example, a manufacturing company might want all factories to be capable of being autonomous but both report activities to and take direction from corporate headquarters. To do this without having multiple databases at each site, a single database instance would likely be both a master and a slave, but for differing aspects of the business's needs. Business decisions would resolve the conflicts - say, the manufacturing node always wins when it comes to data that pertains to its work - rather than something like a "last timestamp/serialized update wins" rule.

Like Markus, I would like to see the various replication efforts merged as best they can be, because even if the majority of users don't use a little bit of everything, surely the more interesting cases would like to, and the entire community is better served if the various "solutions" are in harmony.

Richard
Re: [HACKERS] The may/can/might business
On Thu, 1 Feb 2007, Bruce Momjian wrote:
> Tom Lane wrote:
> > 3606c3606
> > <errmsg("aggregate function calls cannot be nested")));
> > ---
> > >errmsg("aggregate function calls may not be nested")));
> >
> > I don't think that this is an improvement, or even correct English.
> >
> > You have changed a message that states that an action is logically
> > impossible into one that implies we are arbitrarily refusing to let
> > the user do something that *could* be done, if only we'd let him.
> >
> > There is relevant material in the message style guidelines, section
> > 45.3.8: it says that "cannot open file "%s"" ... indicates that the
> > functionality of opening the named file does not exist at all in the
> > program, or that it's conceptually impossible.
>
> Uh, I think you might be reading the diff backwards. The current CVS
> wording is "cannot".

No, Bruce, he got it exactly right: "cannot" indicates, as Tom put it, "logical impossibility," whereas "may not" suggests that something could happen but it's being prevented. His parsing of the English was spot-on.

RT
Re: [HACKERS] Updateable cursors
On Wed, 24 Jan 2007, John Bartlett wrote: [regarding optional DBA/SysAdmin logging of Updateable Cursors]
>
> I can see where you are coming from but I am not sure if a new log entry
> would be such a good idea. The result of creating such a low level log could
> be to increase the amount of logging by a rather large amount.

Given that logging can be controlled via the contents of postgresql.conf, this sounds like an answer from someone who's never had to support a production environment. Putting a check for log_min_error_statement being set to, say, info, hardly seems like a big burden to me. A casual study of the controls in postgresql.conf reveals we already have many controls to get things logged when we want/need them - all of which were deemed appropriate previously. So ISTM that if the DBA/SysAdmin thinks they need the information, who are you to tell them, in effect, "No, I don't want you to have to spend any of your machine's performance giving you the information you need"? Help your users by giving them information when they want it. ...Do you argue that this is useless information?

Richard
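The knobs Richard is pointing at follow the usual postgresql.conf pattern; a sketch of the relevant settings, with illustrative values (the defaults and option lists here are from memory and should be checked against the release in use):

```
# postgresql.conf excerpts -- illustrative values, not recommendations
log_min_messages = info          # server-log verbosity threshold
log_min_error_statement = error  # also log the statement that errored
log_statement = 'ddl'            # none | ddl | mod | all
```

A hypothetical updateable-cursor log line would simply be gated behind thresholds like these, costing nothing when the DBA leaves them off.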
Re: [HACKERS] DROP FUNCTION failure: cache lookup failed for relation
> It seems a general solution would involve having dependency.c take
> exclusive locks on all types of objects (not only tables) as it scans
> them and decides they need to be deleted later. And when adding a
> pg_depend entry, we'd need to take a shared lock and then recheck to
> make sure the object still exists. This would be localized in
> dependency.c, but it still seems like quite a lot of mechanism and
> cycles added to every DDL operation. And I'm not at all sure that
> we'd not be opening ourselves up to deadlock problems.
>
> I'm a bit tempted to fix only the table case and leave the handling of
> non-table objects as is. Comments?
>
> regards, tom lane

The taking of DDL locks is very unlikely to create a performance problem for anyone, as DML statements typically far outnumber DDL statements. Further, in my experience, DDL statements are very carefully thought through and are usually either completely automated by well-crafted programs or are performed by one person at a time - the DBA. I therefore conclude that any deadlock risk is triflingly small and would be a self-inflicted circumstance.

Richard
Re: [HACKERS] Proposal: Commit timestamp
On Thu, 25 Jan 2007, Jan Wieck wrote:
>
> For a future multimaster replication system, I will need a couple of
> features in the PostgreSQL server itself. I will submit separate
> proposals per feature so that discussions can be kept focused on one
> feature per thread.

Hmm... "will need"... Have you prototyped this system yet? ISTM you can prototype your proposal using "external" components so you can work out the kinks first.

Richard
Re: [HACKERS] Proposal: allow installation of any contrib module
FWIW:

> * Better packaging support, eg make it easier to add/remove an extension
> module and control how pg_dump deals with it. We talked about that
> awhile back but nobody did anything with the ideas.

+1

> * Better documentation for the contrib modules; some of them are
> reasonably well doc'd now, but many are not, and in almost all cases
> it's only plain text not SGML.

+1

> * Better advertising, for instance make the contrib documentation
> available on the website (which probably requires SGML conversion
> to happen first...)

+1

RT
Re: [HACKERS] Updateable cursors
On Wed, 24 Jan 2007, FAST PostgreSQL wrote:
>
> We are trying to develop the updateable cursors functionality into
> Postgresql. I have given below details of the design and also issues we are
> facing. Looking forward to the advice on how to proceed with these issues.
>
> Rgds,
> Arul Shaji

Hi Arul,

...I can see people are picking apart the implementation details, so you're getting good feedback on your ambitious proposal. Looks like you've put a lot of thought/work into it.

I've never been a fan of cursors because they encourage bad behavior; "think time" in a transaction sometimes becomes "lunch time" for users, and in any event long lock duration is something to be avoided for the sake of concurrency and sometimes performance (vacuum, etc.). My philosophy is "get in and get out quick."

Ten years ago this May, our first customer insisted we implement what has become our primary API library in Java, and somewhat later I was shocked to learn that, for whatever reason, Java ResultSets are supposed to be implemented as _updateable_cursors._ This created serious security issues for handing off results to other programs through the library - ones that don't even have the ability to connect to the target database. Confirmed in the behavior of Informix, we went through some hoops to remove the need to pass ResultSets around. (If I had only known Postgres didn't implement the ResultSet as an updateable cursor, I'd have pushed for our primary platform to be Postgres!) What impresses me is that Postgres has survived so well without updateable cursors. To my mind it illustrates that they aren't widely used. I'm wondering what troubles lurk ahead once they're available.

As a DBA/SysAdmin, I'd be quite happy if there existed some kind of log element indicating updateable cursors were in use that I could search for easily whenever trying to diagnose some performance or deadlocking problem - say, log file entries that indicated the opening and later closing of such a cursor, with an id of some kind that allowed matching up open/close pairs. I also think that the documentation should be updated to not only indicate usage of this new feature, but provide cautionary warnings about the potential locking issues and, for the authors of libraries, Java in particular, the possible security issues.

Regards,
Richard
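The feature under discussion, roughly as it eventually appeared in PostgreSQL, looks like the following sketch; table, column, and cursor names are hypothetical:

```sql
BEGIN;
-- FOR UPDATE makes the cursor updateable and locks rows as they are
-- fetched -- precisely the long-lock-duration hazard warned about above.
DECLARE cur CURSOR FOR
    SELECT id, price FROM parts WHERE price < 10
    FOR UPDATE;
FETCH NEXT FROM cur;
UPDATE parts SET price = price * 1.1 WHERE CURRENT OF cur;
CLOSE cur;
COMMIT;
```

The row lock taken at FETCH persists until COMMIT, so "think time" between the FETCH and the COMMIT blocks other writers of that row.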
Re: [HACKERS] Function execution costs 'n all that
On Mon, 15 Jan 2007, Neil Conway wrote:
> On Mon, 2007-01-15 at 10:51 -0800, Richard Troy wrote:
> > I therefore propose that the engine evaluate -
> > benchmark, if you will - all functions as they are ingested, or
> > vacuum-like at some later date (when valid data for testing may exist),
> > and assign a cost relative to what it already knows - the built-ins, for
> > example.
>
> That seems pretty unworkable. It is unsafe, for one: evaluating a
> function may have side effects (inside or outside the database), so the
> DBMS cannot just invoke user-defined functions at whim. Also, the
> relationship between a function's arguments and its performance will
> often be highly complex -- it would be very difficult, not to mention
> computationally infeasible, to reconstruct that relationship
> automatically, especially without any real knowledge about the
> function's behavior.
>
> -Neil

Hi Neil,

Tom had already proposed:

> > I'm envisioning that the CREATE FUNCTION syntax would add optional
> > clauses
> >
> >    COST function-name-or-numeric-constant
> >    ROWS function-name-or-numeric-constant
> >
> > that would be used to fill these columns.

I was considering these ideas in the mix; let the user provide either a numeric or a function, the distinction here being that instead of running that function at planning time, it could be run "off-line", so to speak.

Richard
Re: [HACKERS] Function execution costs 'n all that
On Mon, 15 Jan 2007, Tom Lane wrote:
> So I've been working on the scheme I suggested a few days ago of
> representing "equivalence classes" of variables explicitly, and avoiding
> the current ad-hocery of generating and then removing redundant clauses
> in favor of generating only the ones we want in the first place. Any
> clause that looks like an equijoin gets sent to the EquivalenceClass
> machinery by distribute_qual_to_rels, and not put into the
> restrictlist/joinlist data structure at all. Then we make passes over
> the EquivalenceClass lists at appropriate times to generate the clauses
> we want. This is turning over well enough now to pass the regression
> tests,

That was quick...

> In short, this approach results in a whole lot less stability in the
> order in which WHERE clauses are evaluated. That might be a killer
> objection to the whole thing, but on the other hand we've never made
> any strong promises about WHERE evaluation order.

Showing my ignorance here, but I've never been a fan of "syntax-based optimization," though it is better than no optimization. If people are counting on clause order for optimization, then, hmmm... If you can provide a way to at least _try_ to do better, then don't worry about it. It will improve with time.

> Instead, I'm thinking it might be time to re-introduce some notion of
> function execution cost into the system, and make use of that info to
> sort WHERE clauses into a reasonable execution order.

Ingres did/does it that way, IIRC. It's a solid strategy.

> This example
> would be fixed with even a very stupid rule-of-thumb about SQL functions
> being more expensive than C functions, but if we're going to go to the
> trouble it seems like it'd be a good idea to provide a way to label
> user-defined functions with execution costs.
>
> Would a simple constant value be workable, or do we need some more
> complex model (and if so what)?

Ingres would, if I'm not mistaken, gain through historical use via histograms. Short of that, you've got classes of functions - aggregations, for example - and there's sure to be missing information for making a great decision at planning time. However, I take it that the cost here is primarily CPU and not I/O. I therefore propose that the engine evaluate - benchmark, if you will - all functions as they are ingested, or vacuum-like at some later date (when valid data for testing may exist), and assign a cost relative to what it already knows - the built-ins, for example. Doing so could allow this strategy to be functional in short order and be improved with time, so all the work doesn't have to be implemented on day 1. And DBA/sysadmin tweaking can always be done by updating the catalogues.

HTH,
Richard
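The labeling Tom proposed corresponds to what CREATE FUNCTION later accepted; a sketch, with a hypothetical function and an illustrative (not measured) cost value:

```sql
-- COST is expressed in units of cpu_operator_cost; a large value tells
-- the planner to evaluate cheaper WHERE clauses before this one.
-- The function body is a deliberately slow illustrative example.
CREATE FUNCTION expensive_match(t text) RETURNS boolean AS $$
    SELECT t ~ '^(a+)+$'   -- backtracking-heavy regex: costly to run
$$ LANGUAGE sql
COST 1000;
```

With such labels, a qual like `WHERE cheap_flag AND expensive_match(col)` can be reordered so the constant-time test runs first.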
Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.
On Thu, 11 Jan 2007, Tom Lane wrote: ...snip...
>
> (You know, of course, that my opinion is that no sane person would run a
> production database on Windows in the first place. So the data-loss
> risk to me seems less of a problem than the unexpected-failures problem.
> It's not like there aren't a ton of other data-loss scenarios in that OS
> that we can't do anything about...)
>
> regards, tom lane

PLEASE OH PLEASE document every f-ing one of them! (And I don't mean document Windows issues as comments in the source code. Best would be in the official documentation/on a web page.) On occasion, I could *really* use such a list! (If such already exists, please point me at it!) Thing is, Tom, not everybody has the same level of information you have on the subject...

Regards,
Richard
Re: [HACKERS] ideas for auto-processing patches
On Wed, 10 Jan 2007, Jim C. Nasby wrote:
>
> On Thu, Jan 11, 2007 at 08:04:41AM +0900, Michael Glaesemann wrote:
> > > Wouldn't there be some value to knowing whether the patch failed
> > > due to bitrot vs it just didn't work on some platforms out of the gate?
> >
> > I'm having a hard time figuring out what that value would be. How
> > would that knowledge affect what's needed to fix the patch?
>
> I was thinking that knowing it did work at one time would be useful, but
> maybe that's not the case...

"Has it ever worked?" is the single most fundamental technical support question; yes, it has value. One question here - rhetorical, perhaps - is: what changed, and when? Often knowing when things changed can help get you to what changed. (This is what logs are for - and not just automated computer logs, but system management notes like, "I upgraded GCC today.") And that can help you focus in on what to do to fix the problem (such as looking at the GCC release notes).

A non-rhetorical question is: shouldn't the build process mechanism/system know when _any_ aspect of a build has failed (including patches)? I'd think so, especially in a build-farm scenario.

...Just my two cents - and worth every penny! -smile-

Richard
Re: [HACKERS] "recovering prepared transaction" after serverrestart
On Fri, 3 Nov 2006, Tom Lane wrote:
>
> > Is there a way to see prepared transactions where the original session
> > that prepared them has died? Perhaps the message at startup should be
> > "you have at least one prepared transaction that needs resolution".
>
> I am completely baffled by this focus on database startup time. That's
> not where the problem is.
>
> regards, tom lane

I'm not alluding to anyone in particular, just responding to the focus on startup time. When I joined Ingres as a Consultant (back when that was a revered job), we saw this a lot, too, bubbling through the ranks from technical support. Engineering was having a cow over it. We Consultants were expected to backline such problems and be the interface between engineering and the rest of the world.

What we found was that in what we'd call the legitimate cases, the cause for concern over startup time had to do with bugs that forced, one way or another, a server restart. Illegitimate cases - the VAST majority - were the result of, well, let's call them less-than-successful DBAs thrashing their installations with their management breathing down their necks, often with flailing arms and fire coming out of their mouths, saying things like, "I bet my business on this!"... The usual causes there were inappropriate configurations, and a critical cause of _that_ was an installation toolset that didn't help people size/position things properly. Often a sales guy or trainee would configure a test system and then the customer would put that into production without ever reexamining the settings.

I realized there was an opportunity here; I put together a training program and we sold it as a service along with installation to new customers to help them get off on the right foot. Once we did that, new customers were essentially put on notice that they could either pay us to help set them up, or they could do it themselves, but that continuing along with what the salesman or junior techie had done wasn't sufficient for a production environment that you could bet your business on. ...The complaint and concern about startup time dropped out of sight nearly immediately...

Opportunity here for PostgreSQL: a technical document of some kind entitled something like "How to move your testing environment into production." No, unfortunately, I can't volunteer to be the point person on this one. And to the underlying question - is this the case with PostgreSQL? I can't say...

Regards,
Richard
Re: [HACKERS] Design Considerations for New Authentication Methods
> Username/password is not acceptable in a number of situations. This is
> not intended to replace them. This would be in *addition* to supporting
> the current auth methods. I don't understand at all how you feel it'd
> be nice to have yet shouldn't be done.
>
> Thanks,
>
> Stephen

...I thought you said this _needs_ to be done - by using words like "unacceptable" and "required" - and I disagree. There's a difference between what needs to be done and what is desired to be done. Further, I never said "shouldn't."

Richard
Re: [HACKERS] Design Considerations for New Authentication Methods
On Thu, 2 Nov 2006, Magnus Hagander wrote: > > > > I expect we'll need a mapping of some sort, or perhaps a > > sasl_regexp or similar to what is done in OpenLDAP. I don't > > recall PG supporting using the DN from a client cert in an > > SSL connection as a PG username but perhaps I missed it somewhere... > > You can't today. > If we want to add username mapping in SASL or whatever, it might be a > good idea to look at generalizing the authuser-to-dbuser mapping stuff > (like we have for identmap now) into something that can be used for all > external auth methods. Instead of inventing one for every method. > > //Magnus Well, there's simply no need. While I can agree that more could be done, I'm not convinced there's a need because what we have now works fine. Let me support my view by stating first that I perceive that combining the conception of encrypting a communications channel with user authentication to be a very poor choice. I gather from the paragraph above that this is a forgone conclusion. Appologies if I'm mistaken. Just so my point - that another strategy is not needed - is understood, let's agree that SSL is just preventing sniffers from capturing whatever else goes on in "our conversation." Great. What's inside that communication? Why, there's a perfectly workable username/password authentication that happens! Sure, someone could steal that data somehow and break in, but that requires one of the two systems to be breached, and that's a security problem that's out of scope for Postgres. Would signed certificates be preferred? Well, sure, they're nice. I don't object, and in fact welcome some improvements here. For example, I'd love the choice of taking an individual user's certificate and authenticating completely based upon that. However, while this _seems_ to simplify things, it really just trades off with the added cost of managing those certs - username/password is slam-dunk simple and has the advantage that users can share one authentication. 
Unless I've really overlooked something basic, there's nothing lacking in the existing scheme...

Richard
Re: [HACKERS] bug in on_error_rollback !?
Gurjeet, I see that the question of case sensitivity in psql is still being discussed. "I don't have a dog in that fight," but thought I might make a suggestion. To wit: I propose you adopt the standard that I personally adopted eons ago - literally perhaps 20 years ago - and that has by now saved me many days of time, I'm sure: ALWAYS presume case sensitivity and code _exactly_ that way every time. (And develop your own capitalization standard, too, so you'll always know which way it goes.) You'll never be disappointed that way and you won't create hidden bugs. If you want to keep arguing that Postgres should change to meet your expectations, fine, and if it changes, great for you, but you'll just have the same problem someday with some other package - better you change your habits instead!

Richard

On Sat, 28 Oct 2006, Gurjeet Singh wrote:
> Date: Sat, 28 Oct 2006 20:01:00 +0530
> From: Gurjeet Singh <[EMAIL PROTECTED]>
> To: Peter Eisentraut <[EMAIL PROTECTED]>
> Cc: pgsql-hackers@postgresql.org, Andrew Dunstan <[EMAIL PROTECTED]>,
>     Bernd Helmle <[EMAIL PROTECTED]>
> Subject: Re: [HACKERS] bug in on_error_rollback !?
>
> On 10/27/06, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> >
> > In psql, the psql parts follow the syntax rules of psql, the SQL
> > parts follow the syntax rules of SQL. The syntax rules of psql in
> > turn are inspired by Unix shells, sort of because psql is used that
> > way. (Surely one wouldn't want the argument to \i be case-insensitive?)
>
> A very good reasoning... I completely agree...
>
> But you'd also agree that since the psql variables can (and most often
> they are) used in SQL statements, we should consider making at least
> \set case insensitive!
>
> postgres=# \set x 1
> postgres=# select :x;
>  ?column?
> ----------
>         1
> (1 row)
>
> postgres=# select :X;
> ERROR:  syntax error at or near ":"
> LINE 1: select :X;
>                ^
> postgres=#
>
> > what harm allowing "\set on_error_rollback" would be: it certainly
> > won't break any existing scripts.
> > ...
> > I wrote this feature (but someone else chose the name!) and I still
> > occasionally write it lowercase and wonder why it isn't working. :)
>
> I agree, we can't make every '\' command case-insensitive, but a few,
> where it makes absolute sense, should be subject to reconsideration. We
> have the choice of making it more user-friendly, and less confusing.
Re: [HACKERS] Replication documentation addition
On Wed, 25 Oct 2006, Bruce Momjian wrote:

...snip...

> > Data partitioning is often done within a single database on a single
> > server and therefore, as a concept, has nothing whatsoever to do with
> > different servers. Similarly, the second paragraph of this section is
>
> Uh, why would someone split things up like that on a single server?
>
> > problematic. Please define your term first, then talk about some
> > implementations - this is muddying the water. Further, there are both
> > vertical and horizontal partitioning - you mention neither - and each has
> > its own distinct uses. If partitioning is mentioned, it should be more
> > complete.
>
> Uh, what exactly needs to be defined.

OK, "Data partitioning": data partitioning begins in the RDB world with the very notion of tables, and we partition our data during schema development with the goal of "normalizing" the design - "third normal form" being the one most professors talk about as a target. "Data partitioning", then, is the intentional denormalization of the design to accomplish some goal(s) - not all of which are listed in this document's title.

In this context, data partitioning takes two forms based upon which axis of a two-dimensional table is to be divided: the vertical partition divides attributes (as in a master/detail relationship with one-to-one mapping), and the horizontal partition divides based on one or more attributes' domain, or value (as in your example of London records being kept in a database in London, while Paris records are kept in Paris).

The point I was making was that that section of the document was in error because it presumed there was only one form of data partitioning and that it was horizontal. (The document is now missing, so I can't look at the current content - it was here: ftp://momjian.us/pub/postgresql/mypatches/replication.)
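The two forms just described can be made concrete with a few lines of SQL; a minimal, self-contained sketch (run through SQLite via Python so it is executable anywhere; all table and column names are invented):

```python
# Sketch of the two partitioning forms described above, using SQLite so
# the example is self-contained.  All table and column names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Vertical partition: a narrow "core" table holds the hot columns,
    -- and a one-to-one detail table holds the rest (master/detail).
    CREATE TABLE item_core   (item_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item_detail (item_id INTEGER PRIMARY KEY
                                      REFERENCES item_core(item_id),
                              long_description TEXT);

    -- Horizontal partition: rows are divided by an attribute's value,
    -- e.g. London records in one table (or site), Paris in another.
    CREATE TABLE orders_london (order_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders_paris  (order_id INTEGER PRIMARY KEY, city TEXT);
""")
conn.execute("INSERT INTO item_core VALUES (1, 'widget')")
conn.execute("INSERT INTO item_detail VALUES (1, 'rarely-read prose')")

# Hot-path queries touch only the narrow core table; the wide detail
# table is joined in only when its columns are actually needed.
row = conn.execute("""
    SELECT c.name, d.long_description
    FROM item_core c JOIN item_detail d USING (item_id)
""").fetchone()
print(row)
```

The same DDL works in PostgreSQL; the point is only to show where the split happens in each form.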
In answer to your query about why someone would use such partitioning, the nearly universal answer is performance, and the distant second answer is security. In one example that comes immediately to mind, there is a table which is a central core of an application, and, as such, there's a lot to say about the items in this table. The table's size is in the tens to hundreds of millions of rows, and it needs to be joined with something else in a huge fraction of queries. For performance reasons, the table's rows were therefore kept as narrow as possible and one or more detail tables are used for the remaining attributes that logically belong in the table - it's a vertical partition. It's an exceptionally common technique - so common, it probably didn't occur to you that you were even talking about it when you spoke of "data partitioning."

> > Next, Query Broadcast Load Balancing... also needs a lot of work. First,
> > it's foremost in my memory that sending read queries everywhere and
> > returning the first result set back is a key way to improve application
> > performance at the cost of additional load on other systems - I guess
> > that's not at all what the document is after here, but it's a worthy part
> > of a dialogue on broadcasting queries. In other words, this has more parts
> > to it than just what the document now entertains. Secondly, the document
>
> Uh, do we want to go into that here? I guess I could.
>
> > doesn't address _at_all_ whether this is a two-phase-commit environment
> > or not. If not, how are updates managed? If each server operates
> > independently and one of them fails, what do you do then? How do you know
> > _any_ server got an insert/update? ... Each server _can't_ operate
> > independently unless the application does its own insert/update commits to
> > every one of them - and that can't be fast, nor does it load balance,
> > though it may contribute to superior uptime performance by the
> > application.
> I think having the application middle layer do the commits is how it
> works now. Can someone explain how pgpool works, or should we mention
> how two-phase commit has to be done here? pgpool2 has additional
> features.

Well, you hadn't mentioned two-phase commit at all and it surely belongs somewhere in this document - it's a core PG feature and enables a lot of alternative solutions which the document discusses. What it needs to say but doesn't (didn't?) is that the load from read queries can be distributed for load-balancing purposes but that there's no benefit possible for writes, and that replication overhead costs could possibly overwhelm the benefits in high-update scenarios. The point that each server operates independently is only true if you ignore the necessary replication - which, to my mind, links the sys
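In outline, the two-phase commit being discussed works like this: a coordinator first asks every server to prepare (durably promise it can commit), and only if all vote yes does it tell them all to commit; any "no" vote aborts the transaction everywhere. A toy sketch of that protocol (the Participant class is an invented stand-in; real PostgreSQL servers expose the two phases via the PREPARE TRANSACTION and COMMIT PREPARED statements):

```python
# Minimal two-phase-commit coordinator sketch.  Participant is a toy
# stand-in for a remote database server; nothing here is durable.
class Participant:
    def __init__(self, name, will_prepare=True):
        self.name = name
        self.will_prepare = will_prepare  # simulated vote in phase 1
        self.state = "idle"

    def prepare(self):
        # Phase 1: record the transaction and vote on whether we can commit.
        if self.will_prepare:
            self.state = "prepared"
            return True
        self.state = "aborted"
        return False

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: every participant must vote yes, or the whole thing aborts.
    if all(p.prepare() for p in participants):
        # Phase 2: unanimous yes, so commit everywhere.
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"
```

This is exactly the cost the document should spell out: every write must wait for the slowest participant's prepare, which is why writes see no load-balancing benefit.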
Re: [DOCS] [HACKERS] Replication documentation addition
> The documentation comes with the open source tarball. Yuck.
>
> I would welcome if the docs point to an unofficial wiki (maintained
> externally from authoritative PostgreSQL developers) or a website
> listing them and giving a brief of each solution.
>
> postgresql.org already does this for events (commercial training!) and
> news. Point to postgresql.org/download/commercial as there *already* are
> brief descriptions, pricing and website links.

I wouldn't have looked in "download" for such a thing. Nor would I expect everyone with a Postgres-related solution to want to post it on PostgreSQL.org for download. However, I agree that a simple web page listing such things is needed. It's easy to manage - way easier to manage than the development of a competent relational database engine! It's just a bunch of text, after all, and errors aren't that critical and will tend to self-correct through user attention.

> >
> > You list the ones that are stable in their existence (commercial or not).
>
> And how would you determine it? Years of existence? Contribution to
> PostgreSQL's source code? It is not easy and wouldn't be fair. There are
> ones that certainly will be listed, and other doubtful ones (which would
> perhaps complain, that's why I said 'all' - if they are not stable,
> either they stay out of the market or fix their problems).

You have to just trust people. If it's clear that "this isn't PostgreSQL.org", stuff can be unstable, etc - it isn't the group's problem.

> > No it doesn't. Because there is always the, "It wants to be free!" crowd.
>
> Yes, I agree there are. But also development in *that* cutting edge is
> scarce. It feels that something had filled the gap if you list some
> commercial solution, mainly people in the trenches (DBAs). They would,
> obviously, firstly seek the commercial solutions as they are interested.
> So they click 'commercial products' in the main website.

Not necessarily.
Most times, I'll seek the better solution, which may or may not be commercial. Sometimes I'll avoid a commercial version because I don't like the company! ... But getting genuine donations of time - without direct $$ self-interest attached - is a whole 'nother kettle o' fish. For example, there are a lot of students out there that are excellent and would love to have a mechanism to gain something for their resumes before entering the business world. ...There might be some residual interest at UCB, for example. Attracting this kind of support is a completely different dialogue, but on _this_ topic, surely seeking the "it wants to be free!" crowd can't (or shouldn't, in my view) be used as an excuse for not publishing pointers to commercial solutions that involve PostgreSQL. Do it already!

> >> If people (who read the documentation) professionally work with
> >> PostgreSQL, they may already have been briefed by those commercial
> >> offerings in some way.
> >
> > Maybe, maybe not.

The "may" is a wiggler; it sounds like an excuse with a back door. The real answer is "probably not!" I'm in that world. I haven't been briefed. Ever.

> And I agree with your point, still. However, that would open a precedent
> for people to have to maintain lists of stable software in every
> documentation area.

All that's needed is ONE list, with a clear disclaimer. It'll be all text and links, and maybe the odd small .gif logo, if permitted, so it won't be a huge thing. Come on now, are there thousands of such products? Tens sounds more plausible.

Regards,
Richard
Re: [HACKERS] Replication documentation addition
On Wed, 25 Oct 2006, Josh Berkus wrote:
> Bruce,
>
> > It isn't designed for that. It is designed for people to understand
> > what they want, and then they can look around for solutions. I think
> > most agree we don't want a list of solutions in the documentation,
> > though I have a few as examples.
>
> Do they? I've seen no discussion of the matter. I think we should have
> them.

I completely agree; if you want to attract competent people from the business world, one thing you have to do is respect their time by helping them find information, especially about things they don't know exist. All that's needed are pointers, but the pointers need to be to solid documents/resources, not just the top of a heap - if you'll forgive the pun.

Richard
Re: [HACKERS] Replication documentation addition
...define the term (perhaps in a little more detail), and not mention solutions - they change with time anyway.

While I've never used Oracle's clustering tools, I've read up on them and have customers who use them, and I think this description of Oracle clustering is a misread of what the Oracle system actually does. A check with a true Oracle clustering expert is in order here.

Hope this helps. If asked, I'm willing to (re)write some of the bits discussed above.

Regards,
Richard
Re: [HACKERS] Replication documentation addition
...st. By including more information, more users will be attracted to PostgreSQL, whether it be in the documentation or web site. I have been SURE that certain things must exist in the PG world, but haven't known about them with certainty due to time constraints, and would gladly point our customers at Postgres solutions if only I knew about them. Count this paragraph as praise for doing _something_more_ to help get more information to (prospective) users.

Consider someone like me; my company supports five RDBMSes, one of them being Postgres. We are probably not unique in that we've written an SQL dialect translator so we could write our own code in one code line to run anywhere, against any RDBMS (it can learn new dialects) - or perhaps others keep multiple code lines containing variant dialects. Either way, we "don't care" whether our customer has Oracle or PostgreSQL, so long as they buy our stuff. But when our customers - or prospects - come to us with a given scenario, the more we know about Postgres - and its community - the more likely we can steer them to a PG solution, which we would prefer anyway, for lots of reasons: historical, personal, and technical - not to mention cost. The trouble is, Oracle, for example, has already told them (sold them?) on whatever, and we need a rebuttal ready at hand or they'll go with Oracle. We just don't have the time to fight that battle, nor do we wish to risk the sale when we can work with Oracle just fine.

In sum, I agree with Tom Lane and the others who chimed in with "keep the docs clean, use the web site for mentioning other projects/products." And again I applaud this new effort.

Regards,
Richard