Re: [HACKERS] [WIP] collation support revisited (phase 1)
On Mon, Jul 21, 2008 at 03:15:56AM +0200, Radek Strnad wrote:
> I was trying to sort out the problem with not creating a new catalog for
> character sets and I came up with the following ideas. Correct me if my
> ideas are wrong. Since a collation has to have a defined character set.

Not really. AIUI at least glibc and ICU define a collation over all possible characters (i.e. Unicode). When you create a locale you take a subset and use that. Think about it: if you want to sort strings and one of them happens to contain a Chinese character, it can't *fail*. Note that strcoll() has no error return for unknown characters.

> I'm suggesting to use the already written infrastructure of encodings and
> to use the list of encodings in chklocale.c. Currently databases are not
> created with a specified character set but with a specified encoding. I
> think instead of pointing a record in the collation catalog to another
> record in a character set catalog we might use only the name (string) of
> the encoding.

That's reasonable. From an abstract point of view collations and encodings are orthogonal; it's only when you're using POSIX locales that there are limitations on how you combine them. I think you can assume a collation can handle any characters that can be produced by the encoding.

Have a nice day,
--
Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while boarding.
> Thank you for flying nlogn airlines.
[HACKERS] Postgres-R: tuple serialization
Hi,

yesterday I promised to outline the requirements of Postgres-R for tuple serialization, which we have been talking about before. There are basically three types of serialized tuple changes, depending on whether they originate from an INSERT, UPDATE or DELETE. For updates and deletes, it saves the old pkey as well as the origin (a global transaction id) of the tuple (required for consistent serialization on remote nodes). For inserts and updates, all added or changed attributes need to be serialized as well.

             pkey + origin    changes
    INSERT        -              x
    UPDATE        x              x
    DELETE        x              -

Note that the pkey attributes may never be NULL, so an isnull bit field can be skipped for those attributes. For the insert case, all attributes (including primary key attributes) are serialized. Updates require an additional bit field (well, I'm using chars ATM) to store which attributes have changed. Only those should be transferred.

I'm tempted to unify that, so that inserts are serialized as the difference against the default values or NULL. That would make things easier for Postgres-R. However, how about other uses of such a fast tuple applicator? Does such a use case exist at all? I mean, for parallelizing COPY FROM STDIN, one certainly doesn't want to serialize all input tuples into that format before feeding multiple helper backends. Instead, I'd recommend letting the helper backends do the parsing and therefore parallelize that as well. For other features, like parallel pg_dump or even parallel query execution, this tuple serialization code doesn't help much, IMO. So I'm thinking that optimizing it for Postgres-R's internal use is the best way to go.

Comments? Opinions?

Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
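The change-set matrix above can be sketched in code. This is only an illustration of the scheme described in the mail (names like `serialize_change` are invented here; this is not the actual Postgres-R serialization format):

```python
def serialize_change(kind, pkey=None, origin=None, changed=None, values=None):
    """Build a change set following the matrix from the mail:
         INSERT: all attributes (including pkey attributes)
         UPDATE: pkey + origin, a changed-attributes bit field, and
                 only the changed attribute values
         DELETE: pkey + origin only
    """
    msg = {"kind": kind}
    if kind in ("UPDATE", "DELETE"):
        # pkey attributes may never be NULL, so no isnull bits are
        # needed for them; origin is the global transaction id
        msg["pkey"] = pkey
        msg["origin"] = origin
    if kind == "INSERT":
        msg["values"] = values            # inserts carry every attribute
    elif kind == "UPDATE":
        msg["changed"] = changed          # bit field: which attributes changed
        # only the changed attributes are transferred
        msg["values"] = [v for v, c in zip(values, changed) if c]
    return msg
```

A unified scheme, as the mail suggests, would serialize inserts the same way as updates, with `changed` marking every non-default attribute.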
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Monday, 21 July 2008, Tom Lane wrote:
> So my feeling is that we should not accept either of these patches.

My feeling as well.
Re: [HACKERS] overlaps performance
Grzegorz Jaśkiewicz [EMAIL PROTECTED] writes:
> Tom Lane pisze:
>> The reason we don't automatically translate OVERLAPS is that the spec's
>> definition of OVERLAPS is too weird for that to work; in particular it
>> demands a true result for some cases in which one of the four endpoints
>> is NULL, which'd be pretty hard to do with an interval-style index.
> shame, I just work on a thing that would benefit from an index that
> could be used in OVERLAPS. I don't know psql internals, except for how
> GiST works, hence my question.

Ah, but the transformation given is actually a bit of a red herring. If you look at the plan it's doing two bitmap index scans which together are effectively doing a full index scan. The benefit comes from applying the full overlap condition to the index tuples and only scanning the heap for matching tuples. Presumably this index is much smaller than the table and/or cached in memory, so the random accesses are outweighed by the lower i/o.

This does raise the possibility that we should check for index scan paths if we have selective enough columns even if the pathkeys aren't a prefix of the index pathkeys. We would have to do a full index scan but the cost might still be lower. I think the reason we don't (aside from it not being at all useful in the past) is that it would lead to a lot of possible index scans being considered.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!
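For non-NULL endpoints, the overlap test that such a rewrite relies on is just a conjunction of two range conditions, each of which is separately indexable. A rough sketch (assuming closed intervals; note the spec's OVERLAPS additionally demands a true result for some NULL-endpoint cases, which this simple form cannot express - exactly why the translation isn't done automatically):

```python
def overlaps(s1, e1, s2, e2):
    """Plain interval overlap for non-NULL, closed endpoints.

    Each side of the conjunction is a one-sided range condition on a
    single column (s1 <= e2, and e1 >= s2), which is what lets a pair
    of bitmap index scans cover it."""
    return s1 <= e2 and e1 >= s2
```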
Re: [HACKERS] Postgres-R: primary key patches
[EMAIL PROTECTED] (Markus Wanner) writes:
> chris wrote:
>> I agree with you that tables are *supposed* to have primary keys; that's
>> proper design, and if tables are missing them, then something is
>> definitely broken.
> Ah, I see, so you are not concerned about tables with a PRIMARY KEY for
> which one wants another REPLICATION KEY, but only about tables without a
> PRIMARY KEY, for which one doesn't want a PRIMARY KEY in the first place.

"Doesn't want" is probably overstating the matter. I'll describe a scenario to suggest where it might happen.

- A system is implemented, using the database, and, for some reason, no PRIMARY KEY is defined for a table. Someone forgot; it got misconfigured; a mistake was probably made.
- The system then goes into production, and runs for a while. The table has data added to it, and starts to grow rather large.
- At THIS point, we decide to introduce replication, only to discover that there isn't a PRIMARY KEY on the table.

Ideally, we'd take an outage and add the primary key. But suppose we can't afford to do so? The "add indexes concurrently" feature added in 8.3 (if memory serves) *would* allow us to create a *candidate* primary key without forcing an outage. In theory, we'd like to have a true primary key. Sometimes operational issues get in the way.

> However, that's a general limitation of replication at tuple level: you
> need to be able to uniquely identify tuples. (Unlike replication on
> storage level, which can use the storage location for that).

No disagreement; yes, we certainly do need a way to uniquely identify tuples, otherwise we can't replicate UPDATE or DELETE. Sometimes, unfortunately, people make errors in design, and we wind up needing to accommodate situations that are less than perfect. The happy happenstance is that, in modern versions of PostgreSQL, a unique index may be added in the background so that this may be rectified without outage, if you can live with a candidate primary key rather than a true PRIMARY KEY.

> I cannot see any reason for not wanting a PRIMARY KEY, but wanting
> replication, and therefore a REPLICATION KEY. Or are you saying we should
> add a hidden REPLICATION KEY for people who are afraid of schema changes
> and dislike a visible primary key? Would you want to hide the underlying
> index as well?

The scenario I outline above hopefully answers this. It's not a matter that I expect people to specifically desire not to have a primary key. Instead, I expect cases where mistakes compound with operational issues to make them say "Ow - I can't do that now!"

--
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules of the Evil Overlord #145. "My dungeon cell decor will not feature exposed pipes. While they add to the gloomy atmosphere, they are good conductors of vibrations and a lot of prisoners know Morse code." http://www.eviloverlord.com/
[HACKERS] pltcl_*mod commands are broken on Solaris 10
Commands pltcl_listmod, pltcl_delmod, pltcl_loadmod do not work on Solaris 10, because tclsh is not present. Unfortunately the build process substitutes the path to the shell with an empty string, which invokes an infinite loop. See the diff between S10 and Nevada:

    bash-3.00# head /usr/postgres/8.3/bin/pltcl_listmod
    #! /bin/sh
    # $PostgreSQL: pgsql/src/pl/tcl/modules/pltcl_listmod.in,v 1.3 2006/03/11 04:38:40 momjian Exp $
    #
    # Start tclsh \
    exec  "$0" "$@"

    -

    bash-3.2$ head pltcl_listmod
    #! /bin/sh
    # $PostgreSQL: pgsql/src/pl/tcl/modules/pltcl_listmod.in,v 1.3 2006/03/11 04:38:40 momjian Exp $
    #
    # Start tclsh \
    exec /usr/bin/tclsh "$0" "$@"

In my opinion the main problem is in the build process, which does not fail, and the dependency on tclsh is also hidden by the exec command. Any idea how to fix it?

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic
http://sun.com/postgresql
Re: [HACKERS] Concurrent VACUUM and ANALYZE
On Tue, Jul 22, 2008 at 12:31 AM, Matthew T. O'Connor [EMAIL PROTECTED] wrote:
>>> Is DSM going to be in 8.4?
>> The last I had heard, DSM+related improvements weren't close to being
>> guaranteed for this release. If it doesn't make it, waiting another year
>> and a half for something easily fixed would be fairly unacceptable.
>> Should I provide a patch in the event that DSM doesn't make it?
> Can't hurt to submit a patch. Also, could you do something to help
> mitigate the worst case, something like don't update the stats in
> pg_class if the analyze finishes after a vacuum has finished since the
> current analyze started?

Yeah, I was thinking about that. It should actually be very easy to do that.

-Jonah
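The mitigation proposed above - skip the pg_class stats update if a VACUUM completed after this ANALYZE began, because the sample would then be stale - amounts to a simple timestamp comparison. A hypothetical sketch (these helper names are invented, not actual PostgreSQL code):

```python
def should_update_stats(analyze_started_at, last_vacuum_finished_at):
    """Return True if it is safe for ANALYZE to write its stats to
    pg_class. If a VACUUM finished after this ANALYZE started, the
    ANALYZE sample no longer reflects the table, so skip the update."""
    if last_vacuum_finished_at is None:
        return True                       # no vacuum has run; safe
    return last_vacuum_finished_at < analyze_started_at
```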
Re: [HACKERS] pltcl_*mod commands are broken on Solaris 10
Zdenek Kotala wrote:
> Commands pltcl_listmod, pltcl_delmod, pltcl_loadmod do not work on
> Solaris 10, because tclsh is not present.

I found that tclsh is available on Solaris 10 in /usr/sfw/bin and its name is tclsh8.3.

Zdenek
Re: [HACKERS] Postgres-R: primary key patches
Hi,

chris wrote:
> I'll describe a scenario to suggest where it might happen.
> - A system is implemented, using the database, and, for some reason, no
>   PRIMARY KEY is defined for a table. Someone forgot; it got
>   misconfigured; a mistake was probably made.
> - The system then goes into production, and runs for a while. The table
>   has data added to it, and starts to grow rather large.
> - At THIS point, we decide to introduce replication, only to discover
>   that there isn't a PRIMARY KEY on the table.

Yeah, that's the situation I had in mind as well.

> Ideally, we'd take an outage and add the primary key. But suppose we
> can't afford to do so?

You are assuming that one doesn't need to take an outage to start replication in the first place. As Postgres-R comes with system catalog changes, that's not the case. You will at least need to restart the postmaster; without some sort of system catalog upgrading (which doesn't currently exist) you even need a full dump/restore cycle.

> The "add indexes concurrently" feature added in 8.3 (if memory serves)
> *would* allow us to create a *candidate* primary key without forcing an
> outage.

Postgres-R is primarily being developed for *future* versions of Postgres; I don't see any point in back-porting something that is not complete for the current version, yet.

>> However, that's a general limitation of replication at tuple level: you
>> need to be able to uniquely identify tuples. (Unlike replication on
>> storage level, which can use the storage location for that).
> No disagreement; yes, we certainly do need a way to uniquely identify
> tuples, otherwise we can't replicate UPDATE or DELETE.

Yup, that's the real issue here.

> The scenario I outline above hopefully answers this.

I see the problem of wanting to replicate tables which didn't have a PRIMARY KEY before. But I still cannot see a use case for hiding indices or keys.

> It's not a matter that I expect people to specifically desire not to
> have a primary key. Instead, I expect cases where mistakes compound with
> operational issues to make them say "Ow - I can't do that now!"

Yeah, these issues certainly need to be addressed. I think the ability to add a hidden column and a (visible!) primary key on that column should help in that case. Thinking about index creation time doesn't make sense as long as we still need a dump/restore cycle to set up replication. And even then, that operational issue has nothing to do with the question of hiding the newly generated index or not.

Regards

Markus
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Tom Lane [EMAIL PROTECTED] writes: From a project-management point of view, it's insanity to set a presumption that pgfoundry is just a proving ground for code that should eventually get into core once it's mature enough or popular enough or whatever. We *have to* encourage the development of a cloud of subprojects around the core, or core will eventually collapse of its own weight. One option might be the Perl approach of having separately developed projects which are snapshotted at stable points and included in the release. It has the chance to offer the best of both worlds by offloading development outside of core but provide users with a perceived complete system. For perl this is important because they want programmers to be able to assume certain modules are present. For postgres the case is less compelling since there isn't an interoperability issue. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Tue, Jul 22, 2008 at 2:39 PM, Gregory Stark [EMAIL PROTECTED] wrote: Tom Lane [EMAIL PROTECTED] writes: From a project-management point of view, it's insanity to set a presumption that pgfoundry is just a proving ground for code that should eventually get into core once it's mature enough or popular enough or whatever. We *have to* encourage the development of a cloud of subprojects around the core, or core will eventually collapse of its own weight. One option might be the Perl approach of having separately developed projects which are snapshotted at stable points and included in the release. It has the chance to offer the best of both worlds by offloading development outside of core but provide users with a perceived complete system. Yeah, but then what happens when the offloaded development/maintenance doesn't happen? We'd end up pulling the package or having to maintain it ourselves anyway. /D -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pltcl_*mod commands are broken on Solaris 10
Zdenek Kotala [EMAIL PROTECTED] writes: Commands pltcl_listmod, pltcl_delmod, pltcl_loadmod does not work on Solaris 10, because there is not present tclsh. Shouldn't this bug be filed against Solaris' clearly-broken tcl installation? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] collation support revisited (phase 1)
Martijn van Oosterhout wrote:
> On Sat, Jul 12, 2008 at 10:02:24AM +0200, Zdenek Kotala wrote:
>> Background: We specify encoding in the initdb phase. ANSI specifies
>> repertoire, charset, encoding and collation. If I understand it
>> correctly, then charset is a subset of repertoire and specifies the list
>> of allowed characters for a language-collation. Encoding is a mapping of
>> a character set to binary format. For example, for the Czech alphabet
>> (charset) we have 6 different 8-bit ASCII encodings, but on the other
>> side, UTF8 covers multiple charsets.
> Oh, so you're thinking of a charset as a sort of check constraint. If
> your locale is turkish and you have a column marked charset ASCII then
> storing lower('HI') results in an error.

Yeah, if you use the strcoll function it fails when an illegal character is found. See http://www.opengroup.org/onlinepubs/009695399/functions/strcoll.html

> A collation must be defined over all possible characters, it can't
> depend on the character set. That doesn't mean sorting in en_US must do
> something meaningful with japanese characters, it does mean it can't
> throw an error (the usual procedure is to sort on unicode point).

Collation cannot be defined on arbitrary characters. There is no relation between Latin and Chinese characters. Collation makes sense when you are able to specify the <, =, > operators. If you need to compare Japanese and Latin characters, then ANSI specifies a default collation for each repertoire. I think it is usually bitwise comparing.

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic
http://sun.com/postgresql
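The "default collation for the repertoire ... usually bitwise comparing" mentioned above is essentially ordering by code point, which is what comparing strings directly does in most languages. A small demonstration of mixed-script strings with no linguistic relation still sorting deterministically:

```python
# Sorting mixed-script strings by Unicode code point: Latin letters,
# accented Latin, and CJK characters have no common linguistic order,
# but code-point comparison still yields a total, deterministic order.
words = ["zebra", "apple", "中文", "Österreich"]

# Python compares strings code point by code point, so this is the
# "bitwise" fallback ordering the mail describes.
by_codepoint = sorted(words)
```

Note this says nothing about linguistic correctness; 'Ö' (U+00D6) sorts after 'z' (U+007A) here, which is exactly why real locales layer a collation on top.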
Re: [HACKERS] [patch] plproxy v2
On Mon, Jul 21, 2008 at 09:32:57PM -0400, Tom Lane wrote:
> Marko Kreen [EMAIL PROTECTED] writes:
>> 2. If cluster connection strings do not have 'user=' key,
>>    ' user=' || current_username() is appended to it.
> Cool, I missed that. At minimum the documentation has to explain this
> point and emphasize the security implications. Is it a good idea to
> allow user= in the cluster strings at all?

I wondered about this myself. Is there anything at all preventing me from doing 'user=' for some other user? If not...

>> Also, plproxy does _nothing_ with passwords. That means the password
>> for remote connection must be in postgres user's .pgpass,
> That seems *exactly* backwards, because putting the password in postgres
> user's .pgpass is as good as disabling password auth altogether.

...this means that any user on system1 for which there is at least one user on system2 with plproxy access automatically also has that access on system2. (Plus what Tom noted.)

We regularly get beat up about any aspect of our security apparatus that isn't secure by default. This definitely isn't, and from a PR point of view (if nothing else) that doesn't seem a good idea. I'm less worried about the PR, and more worried about the truck-sized hole this opens in any authentication controls. It seems to me that it's a fairly serious problem.

A

--
Andrew Sullivan [EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/
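The ' user=' || current_username() behavior under discussion can be sketched as follows. This is a hypothetical illustration of the rule described in the thread, not plproxy's actual C code (which parses the connection string properly rather than using a substring check):

```python
def effective_connstr(cluster_connstr, current_user):
    """Mirror the rule from the thread: if the cluster connection
    string carries no user= key, append the calling user's name, so
    the remote session runs as the same role by default.

    Naive sketch: a real implementation must tokenize the string,
    since 'user=' could appear inside a quoted value."""
    if "user=" not in cluster_connstr:
        return cluster_connstr + " user=" + current_user
    return cluster_connstr   # explicit user= in the string wins
```

The security question in the thread is precisely about the second branch: an explicit user= set by whoever controls the cluster configuration overrides the caller's identity.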
Re: [HACKERS] pltcl_*mod commands are broken on Solaris 10
Tom Lane wrote:
> Zdenek Kotala [EMAIL PROTECTED] writes:
>> Commands pltcl_listmod, pltcl_delmod, pltcl_loadmod do not work on
>> Solaris 10, because tclsh is not present.
> Shouldn't this bug be filed against Solaris' clearly-broken tcl
> installation?

I'm not able to decide whether the tcl installation is broken on Solaris 10. tclsh is there, but it is called tclsh8.3 and the symbolic link is not there. But the problem is also in configure, which does not fail when tclsh is not found. I'm able to fix it on the build machine by specifying the TCLSH environment variable, but configure should still be fixed.

Zdenek
Re: [HACKERS] [patch] plproxy v2
On 7/22/08, Andrew Sullivan [EMAIL PROTECTED] wrote:
> On Mon, Jul 21, 2008 at 09:32:57PM -0400, Tom Lane wrote:
>> Marko Kreen [EMAIL PROTECTED] writes:
>>> 2. If cluster connection strings do not have 'user=' key,
>>>    ' user=' || current_username() is appended to it.
>> Cool, I missed that. At minimum the documentation has to explain this
>> point and emphasize the security implications. Is it a good idea to
>> allow user= in the cluster strings at all?
> I wondered about this myself. Is there anything at all preventing me
> from doing 'user=' for some other user? If not...

For that you need to overwrite the plproxy.get_cluster_partitions() function or the data it operates on. I don't see any hole in this, unless explicitly created.

>>> Also, plproxy does _nothing_ with passwords. That means the password
>>> for remote connection must be in postgres user's .pgpass,
>> That seems *exactly* backwards, because putting the password in
>> postgres user's .pgpass is as good as disabling password auth
>> altogether.
> ...this means that any user on system1 for which there is at least one
> user on system2 with plproxy access automatically also has that access
> on system2. (Plus what Tom noted.)

For that, system2 needs to be added as a partition to a cluster, or specified explicitly in a CONNECT statement. And the user can execute only pre-determined queries/functions on system2.

> We regularly get beat up about any aspect of our security apparatus
> that isn't secure by default. This definitely isn't, and from a PR
> point of view (if nothing else) that doesn't seem a good idea. I'm less
> worried about the PR, and more worried about the truck-sized hole this
> opens in any authentication controls. It seems to me that it's a fairly
> serious problem.

Do you still see a big hole?

--
marko
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Dave Page [EMAIL PROTECTED] writes: On Tue, Jul 22, 2008 at 2:39 PM, Gregory Stark [EMAIL PROTECTED] wrote: Tom Lane [EMAIL PROTECTED] writes: From a project-management point of view, it's insanity to set a presumption that pgfoundry is just a proving ground for code that should eventually get into core once it's mature enough or popular enough or whatever. We *have to* encourage the development of a cloud of subprojects around the core, or core will eventually collapse of its own weight. One option might be the Perl approach of having separately developed projects which are snapshotted at stable points and included in the release. It has the chance to offer the best of both worlds by offloading development outside of core but provide users with a perceived complete system. Yeah, but then what happens when the offloaded development/maintenance doesn't happen? We'd end up pulling the package or having to maintain it ourselves anyway. Yeah, it's probably a plan which would work better once there's some solidly maintained external projects for an extended period of time. I suppose it's not entirely unlike the history of tsearch. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pltcl_*mod commands are broken on Solaris 10
Zdenek Kotala [EMAIL PROTECTED] writes:
> But the problem is also in configure, which does not fail when tclsh is
> not found.

Yes it does ...

    if test $[#] -eq 0; then
        test -z "$TCLSH" && AC_MSG_ERROR([unable to locate tclConfig.sh because no Tcl shell was found])

regards, tom lane
Re: [HACKERS] [patch] plproxy v2
Andrew Sullivan [EMAIL PROTECTED] writes: On Mon, Jul 21, 2008 at 09:32:57PM -0400, Tom Lane wrote: Marko Kreen [EMAIL PROTECTED] writes: 2. If cluster connection strings do not have 'user=' key, ' user=' || current_username() is appended to it. Cool, I missed that. At minimum the documentation has to explain this point and emphasize the security implications. Is it a good idea to allow user= in the cluster strings at all? I wondered about this myself. Is there anything at all preventing me from doing 'user=' for some other user? If not. . . I think the assumption is that the cluster connection info would be set up by a superuser. However, if there's any way for a non-superuser to subvert the info returned by the plproxy configuration functions, you got trouble. So a lot would depend on how carefully those are coded. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: primary key patches
Markus Wanner [EMAIL PROTECTED] writes:
> Thinking about index creation time doesn't make sense, as long as we
> still need a dump/restore cycle to setup replication. And even then,
> that operational issue has nothing to do with the question of hiding
> the newly generated index or not.

Let me note that one of the design criteria for Slony-I was to explicitly NOT have such a requirement. Making the assumption that it *is* acceptable to disrupt operations for the duration of a dump/restore cycle is certain to limit interest in a replication system.

A most pointed case where that will cause heartburn of the "I refuse to use this" sort is if that disruption needs to take place when recovering from the failure of a node. That sort of disruption is certainly counterproductive to the usual goal of replication enhancing system availability.

Maybe I am misreading you; I rather hope so.

--
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules of the Evil Overlord #145. "My dungeon cell decor will not feature exposed pipes. While they add to the gloomy atmosphere, they are good conductors of vibrations and a lot of prisoners know Morse code." http://www.eviloverlord.com/
[HACKERS] Plans for 8.4
I'm looking at shortlisting things that I'll be working on for 8.4. I've thrown out a few small patches I had, but now am looking at these projects:

* Hot Standby
* Logical replication via transaction log
* Integration with synchronous replication
* Join removal
* stats hooks
* DDL locking reductions

Initial design work on them is mostly done; I just need to start posting designs and patches. I've got an early-days prototype of Hot Standby that allows connection, plus months of thinking on the other topics, so I'm further ahead than it may previously have appeared. All of this is dependent upon sponsorship... THANK YOU to current sponsors.

I've had to break off work on MERGE now, for lack of time, plus the presence of some unresolved details in the design that make it questionable whether it would be fully accepted. I've also punted on a few other items that look like more time than value to the project, in my opinion. But it might be possible to pick them up later, depending what happens in the next few months.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] pltcl_*mod commands are broken on Solaris 10
Tom Lane wrote:
> Zdenek Kotala [EMAIL PROTECTED] writes:
>> But the problem is also in configure, which does not fail when tclsh
>> is not found.
> Yes it does ...
>     if test $[#] -eq 0; then
>         test -z "$TCLSH" && AC_MSG_ERROR([unable to locate tclConfig.sh because no Tcl shell was found])

Yeah, I looked deeply at our Solaris build script and the problem is with the following configure setup:

    ./configure --prefix=/tmp/pg --with-tcl --with-tclconfig=/usr/sfw/lib

It found tclConfig.sh, but not tclsh:

    ...
    checking for tclsh... no
    checking for tcl... no
    checking for tclConfig.sh... /usr/sfw/lib/tclConfig.sh
    ...

and configure finishes successfully, but the pltcl_* scripts are broken. If I define the TCLSH env variable it seems to be OK:

    ...
    checking for tclsh... /usr/sfw/bin/tclsh8.3
    checking for tclConfig.sh... /usr/sfw/lib/tclConfig.sh
    ...

Zdenek
Re: [HACKERS] [patch] plproxy v2
On 7/22/08, Tom Lane [EMAIL PROTECTED] wrote: Marko Kreen [EMAIL PROTECTED] writes: On 7/21/08, Tom Lane [EMAIL PROTECTED] wrote: I looked through this a bit, and my principal reaction was what are the security implications? There are 2 aspects to it: 1. Function can be created only by superuser. What I'm concerned about is who they can be *called* by. I'd be happier if the default behavior was that there was no public execute privilege for plproxy functions. I think right now that could be enforced by having plproxy's validator procedure replace any null proacl entry with something that explicitly refuses public execute. That's a bit of a hack though. Maybe it'd be worth inventing per-PL default ACLs, instead of having a one-size-fits-all policy? Note1 that if user (admin) wants he can also do user filtering/mapping in plproxy.* functions. Note2 - instead of restricting privileges on actual functions, we could instead restrict privilege on the 2 functions under 'plproxy' schema, or directly on schema. Seems simpler. Eg. create simple default installation with REVOKE ALL FROM PUBLIC. 2. If cluster connection strings do not have 'user=' key, ' user=' || current_username() is appended to it. Cool, I missed that. At minimum the documentation has to explain this point and emphasize the security implications. Ok. Is it a good idea to allow user= in the cluster strings at all? I think so - if the plproxy database itself already is main point of authentication, both can-connect and can-execute sense, then it's good to avoid complicating setup and send queries away under single user that has minimal rights on partition dbs and can only execute requested functions. Also, plroxy does _nothing_ with passwords. That means the password for remote connection must be in postgres user's .pgpass, That seems *exactly* backwards, because putting the password in postgres user's .pgpass is as good as disabling password auth altogether. 
Consider that it would also hand all the keys to the kingdom over to someone who had access to dblink on the same machine (not even the same cluster, so long as it was run by the same postgres user!). Good point. Some ideas for password handling: 1. Require that user always provider both username and password in plproxy.get_cluster_partitions(). We could add separate function for that or add fields to plproxy.get_partitions(), although this is not necessary - user can add them simply to connect string. Main problems with this is that maybe you don't want to show the passwords to anyone who can execute plproxy.* functions? 2. Let PL/Proxy fetch user password hash from pg_shadow, add API to libpq to use the hash on authentication instead plaintext password. This ofcourse expects that remote server uses same auth method as current one. Despite appearance it does not have security problems - the hashes are already equivalent to plaintext password. But I don't think plproxy can and should protect dumb admins who create remote_exec(sql) function and allow untrusted users to execute it. We regularly get beat up about any aspect of our security apparatus that isn't secure by default. This definitely isn't, and from a PR point of view (if nothing else) that doesn't seem a good idea. I repeat that I don't feel comfortable in the least with plproxy's security model. Btw, I'm very thankful for your review. I would really like improve the security of plproxy whatever the merge decision will be, so hopefully we can discuss it further. To make discussion easier, here are list of possible problems/fixes discussed thus far (as I see): Problems: - restrict users who can access remote dbs by default. - avoid spreading passwords too far. - .pgpass gives them to any libpq client - inside connect string they are visible to calling user (although only his own?) Documentation/default setup fixes: 1. Restrict access to 'plproxy' schema or functions under that schema. 
Only users that have grants can use plproxy functions. 2. Create a default setup that does user filtering/mapping by default. Have the permissions on the functions and tables carefully tuned to allow minimal access. 3. Make the default setup also handle passwords from tables. So instead of the user adding password handling insecurely, he can use it or remove it from an already secure setup. Code fixes: 4. Create plproxy functions without execute permissions by default. (Seems unnecessary, as 1 and 2 already give that?) 5. Let plproxy use the user's password hash directly from pg_shadow. (Unless user= or password= is given in the connection string?) Seems like restricting access is easy, but only 5) gives secure password handling. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription:
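Fix 1 above (restricting access to the 'plproxy' schema) could be sketched roughly as follows. This is only an illustration of the idea, not actual PL/Proxy setup code; the role name proxy_users and the single-argument signature of get_cluster_partitions are assumptions for the example:

```sql
-- Sketch: lock down the plproxy configuration schema by default,
-- then grant access only to a dedicated role.
REVOKE ALL ON SCHEMA plproxy FROM PUBLIC;
REVOKE ALL ON FUNCTION plproxy.get_cluster_partitions(text) FROM PUBLIC;

GRANT USAGE ON SCHEMA plproxy TO proxy_users;
GRANT EXECUTE ON FUNCTION plproxy.get_cluster_partitions(text) TO proxy_users;
```

With a default like this in place, only users explicitly granted proxy_users could resolve cluster configuration, which is the "only users that have grants can use plproxy functions" behavior described above.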
Re: [HACKERS] [patch] plproxy v2
Marko Kreen [EMAIL PROTECTED] writes: And the user can execute only pre-determined queries/functions on system2. If that were actually the case then the security issue wouldn't loom quite so large, but the dynamic_query example in the plproxy regression tests provides a perfect example of how to ruin your security. Do you still see a big hole? Truck-sized, at least. The complaint here is not that it's impossible to use plproxy securely; the complaint is that it's so very easy to use it insecurely. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema-qualified statements in pg_dump output
On Mon, 2008-07-21 at 23:53 -0400, Owen Hartnett wrote: It would generate a schema dump where all the references to schema_name were replaced by masquerade_name. Good idea, can I tweak that a bit? No need to specify the name at pg_dump time. For text files, just use an option to specify whether we change the actual schema name and replace it with the text :PGDUMPSCHEMA. pg_dump --relocatable-schema (or alternate option name) Then when we reload, we just run psql -f pgdump.file -v PGDUMPSCHEMA=newlocation -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: primary key patches
Hi, Christopher Browne wrote: Markus Wanner [EMAIL PROTECTED] writes: Thinking about index creation time doesn't make sense, as long as we still need a dump/restore cycle to set up replication. And even then, that operational issue has nothing to do with the question of hiding the newly generated index or not. Let me note that one of the design criteria for Slony-I was to explicitly NOT have such a requirement. That's a pretty cool feature, but hasn't been one of the primary design goals of Postgres-R. Making the assumption that it *is* acceptable to disrupt operations for the duration of a dump/restore cycle is certain to limit interest in a replication system. I agree, that's certainly true. A most pointed case where that will cause heartburn of the 'I refuse to use this' sort is if that disruption needs to take place when recovering from the failure of a node. That sort of disruption is certainly counterproductive to the usual goal of replication enhancing system availability. Huh? What does migration between major Postgres versions have to do with node failures or recoveries? System availability certainly *is* one of the primary design goals of Postgres-R. Thus, once installed and running, you certainly don't need any such procedure again. Certainly not due to node failures. It would be nice if future upgrades (i.e. major version upgrades) of single nodes could be done while the rest of the cluster is running. That would mean having a pretty static binary change set communication protocol, which works between different major Postgres versions. That's certainly planned, but hey, we don't have a production ready version for *any* major version, yet. Maybe I am misreading you; I rather hope so. With an additional process and schema changes, Postgres-R takes quite a different approach than Slony. I don't think that would have been possible without forcing at least a Postmaster restart. 
The schema changes are pretty minimal and can probably be done manually (well, script driven, perhaps) before restarting with a Postmaster which has replication compiled in. That would save the dump/restore cycle, but certainly not the Postmaster restart. However, with regard to the catalog version, Postgres-R can be thought of as another major version of Postgres. (Maybe I should even extend the catalog version with an 'R' or something, so as to prevent a normal Postgres version from running on a data directory of a Postgres-R installation.) Regards Markus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCHES] GIN improvements
Teodor Sigaev [EMAIL PROTECTED] writes: Updated: http://www.sigaev.ru/misc/fast_insert_gin-0.9.gz I still have no clarity on the acceptability of the suggested aminsertcleanup call. I started to look at this. I don't understand why VACUUM does an insert cleanup before starting to vacuum, but VACUUM FULL doesn't? I don't particularly like the idea of adding aminsertcleanup calls immediately before other AM operations such as ambulkdelete. It seems to me that those operations ought to include the cleanup subroutine themselves, if they need it; they shouldn't depend on callers to get this right. Offhand it looks to me like the only new index AM call needed is the one at vacuum startup, which tempts me to propose that the new AM entry point should be called amvacuumstartup, instead of wiring in the assumption that what it's for is specifically cleanup of insertions. Comments? I can make the change if you think it's okay --- I'm busy cleaning up docs and comments at the moment. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] phrase search
1. What is the meaning of such a query operator? foo #5 bar - true if the document has word foo followed by bar at the 5th position. foo #5 bar - true if the document has word foo followed by bar within 5 positions. foo #5 bar - true if the document has word foo followed by bar after 5 positions. Sounds good, but maybe it's overkill. etc. 2. How to implement such query operators? Should we modify QueryItem to include additional distance information, or is there any other way to accomplish it? Is the following list sufficient to accomplish this? a. Modify to_tsquery b. Modify TS_execute in tsvector_op.c to check the new operator Exactly. Is there anything needed in the rewrite subsystem? Yes, of course - the rewrite system should support that operation. 3. Are these valid uses of the operators, and if yes what would they mean? foo #5 (bar cup) It must be supported! Because lexize might return a subtsquery. For example, Russian ispell can return several lexemes: adfg can become 'adf | adfs | ad'; Norwegian and German are more complicated: abc - (ab c) | (a bc) | abc 4. If the operator only applies to two query items, can we create an index such that (foo, bar) - documents[min distance, max distance]? How difficult is it to implement an index like this? No, the index should execute the query 'foo bar' and set the recheck flag to true to execute 'foo #5 bar' on the original tsvector from the table. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
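If a distance operator along these lines were added to tsquery, usage might look like the sketch below. This is hypothetical syntax - the operator spelling is exactly what is still being discussed in this thread, and the table and column names are invented for the example:

```sql
-- Hypothetical: match documents where 'foo' is followed by 'bar'
-- within 5 positions. As Teodor describes, the index would execute
-- the plain conjunction and set the recheck flag, so the distance
-- condition is verified against the original tsvector from the table.
SELECT d.id
FROM docs d
WHERE d.tsv @@ to_tsquery('english', 'foo #5 bar');
```

The recheck approach means no new index structure is needed: existing GIN/GiST tsvector indexes narrow the candidate set, and only candidate rows pay the cost of the positional check.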
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Dave Cramer wrote: On 21-Jul-08, at 4:28 PM, Andrew Sullivan wrote: On Mon, Jul 21, 2008 at 01:17:39PM -0700, David E. Wheeler wrote: pgFoundry ain't the CPAN, alas. Maybe that's the problem that really needs solving? One of the big Postgres features is its extensibility. I agree that the extensions can sometimes be hard to find, but surely the answer to that is not an infinitely large source tarball? I'd have to agree with Andrew here. Making it easy to get extensions would solve lots of problems. What about starting a secondary team that would review extensions? Projects on pgFoundry could be identified as reviewed and approved, as a type of recommendation that they are of acceptable quality to use in production - maybe against certain versions. What I would see is current core developers teaching a new group of developers to do the add-on code reviews to a point where they could continue on by themselves - one or two from core may wish to stay in this group - with core checking in from time to time to ensure the quality doesn't slip. Thereby giving some confidence in the use of the add-ons that get *certified*. A new add-on would be presented to this group and maybe voted on in one of the lists (General or Admin?) to get acceptance into the review process. Anyone interested in starting this? I do agree that the main code doesn't need to contain every feature that is available. But we do need to improve the perception of add-ons. Hardly anyone thinks twice about adding an extension to Firefox, Perl, GIMP or osCommerce, or even drivers to the OS, and we need to aim for a similar attitude here. I do think that having a list of reviewed and approved add-ons that is easily found on the main site along with release downloads will help along those lines. We need to promote that PostgreSQL isn't a one-size-fits-all solution; it is a solid product that can be customised to suit your needs. 
-- Shane Ambler pgSQL (at) Sheeky (dot) Biz Get Sheeky @ http://Sheeky.Biz -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Transaction-controlled robustness for replication
One of the cool features of 8.3 was the ability to control at the transaction level whether we would use synchronous or asynchronous commit. We're planning to add integrated replication features to 8.4, and I think it will be possible to extend the concept of asynchronous commit to a more general form of transaction-level robustness. Note that the proof that it's possible to mix asynchronous and synchronous transactions on the same system has already been established, so this is just a straight extension of that concept. Asynchronous commit controls whether we go to disk at time of commit, or whether we defer this slightly. We have the same options with replication: do we replicate at time of commit, or do we defer this slightly for performance reasons. DRBD and other replication systems show us that there is actually another difference when talking about synchronous replication: do we go to disk on the standby before acknowledging the primary? We can generalise this as three closed questions, each answered either Yes (Synchronous) or No (Asynchronous):

* Does WAL get forced to disk on the primary at commit time?
* Does WAL get forced across the link to the standby at commit time?
* Does WAL get forced to disk on the standby at commit time?

In code, these are simple if tests: do we wait, or not? We could represent this with 3 parameters:

synchronous_commit = on | off
synchronous_standby_transfer = on | off
synchronous_standby_wal_fsync = on | off

If we are able to define these robustness characteristics for each transaction *separately* then it will represent an industry first: no other database has that capability implemented or planned on published roadmaps, nor has it been discussed in research to my knowledge. Changing the parameter settings at transaction level would be expensive if we had to set three parameters. 
Or we could use just two parameters:

synchronous_commit = on | off
synchronous_replication = 'AA', 'SA' or 'SS'

with A = Asynchronous, S = Synchronous, which corresponds with DRBD's algorithms like this:

DRBD A = AA
DRBD B = SA
DRBD C = SS

Or we could use just a single parameter:

synchronous_commit = 'AAA', 'SAA', 'SSA', 'SSS', or on | off when no log-based replication is defined

Having the ability to set these at transaction level would be very cool. Having it set via a *single* parameter would make it much more viable to switch between AAA for bulk, low-importance data and SSS for very low volume, critical data, or somewhere in between, on the same server, at the same time. So the proposal in summary is:

* allow various modes of synchronous replication for perf/robustness
* allow modes to be specified per-transaction
* allow modes to be specified as a single parameter

I think Itagaki may have described similar concepts at PGCon 2008, but this thread has been started to make sure that meme definitely has been released into the wild, and to discuss how we might specify it. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
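Under the three-parameter variant proposed above, a configuration trading standby durability for throughput might look like this. Note these are proposed parameter names from this thread, not existing GUCs (only synchronous_commit exists in 8.3):

```
# Proposed (not yet existing) per-transaction robustness settings:
synchronous_commit = on              # force WAL to disk on primary at commit
synchronous_standby_transfer = on    # wait for WAL to reach the standby
synchronous_standby_wal_fsync = off  # don't wait for the standby's fsync
```

Since these would be ordinary transaction-settable parameters, a bulk-load transaction could presumably drop to the fully asynchronous mode with SET LOCAL inside its own BEGIN/COMMIT, while concurrent critical transactions keep the synchronous settings - which is exactly the mixed-mode operation the proposal is after.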
Re: [HACKERS] [PATCHES] GIN improvements
I started to look at this. I don't understand why VACUUM does an insert cleanup before starting to vacuum, but VACUUM FULL doesn't? Hmm. Maybe I missed something, but I don't understand where and what... I tried to track all places of the ambulkdelete call. aminsertcleanup should be called before any ambulkdelete, because ambulkdelete doesn't scan the pending list, which can store items to be deleted, and hence the index would store item pointers to absent tuples. needed is the one at vacuum startup, which tempts me to propose that the new AM entry point should be called amvacuumstartup, instead of wiring in the assumption that what it's for is specifically cleanup of insertions. That's possible, but inserts into the index should be forbidden between amvacuumstartup and the last call of ambulkdelete. Comments? I can make the change if you think it's okay --- I'm busy cleaning up docs and comments at the moment. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: primary key patches
Markus Wanner wrote: Ideally, we'd take an outage and add the primary key. But suppose we can't afford to do so? You are assuming that one doesn't need to take an outage to start replication in the first place. As Postgres-R comes with system catalog changes, that's not the case. You will at least need to restart the postmaster; without some sort of system catalog upgrading (which doesn't currently exist) you even need a full dump/restore cycle. Hey, for Replicator I wrote a bootstrapping system for catalog upgrading -- it starts a special bootstrap mode and allows creating new catalogs, their indexes, and a bunch of functions. Afterwards everything is considered internal. It's quite hackish but it works ... -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema-qualified statements in pg_dump output
Simon Riggs [EMAIL PROTECTED] writes: No need to specify the name at pg_dump time. For text files, just use an option to specify whether we change the actual schema name and replace it with the text :PGDUMPSCHEMA. pg_restore is in an even worse position than pg_dump to make this happen; it would not be able to do anything that's smarter than a sed-like substitution. I doubt that the original idea can be made to work, but this improvement will entirely guarantee failure. (Note: the problem is not so much with the names of the objects you're directly creating, as with object cross-references that're embedded in the DDL.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCHES] GIN improvements
Teodor Sigaev [EMAIL PROTECTED] writes: It's close to trivial to revert this piece and add the cleanup call to ginbulkdelete/ginvacuumcleanup. Early variants did exactly that. Yeah, I think we should do it that way. On reflection I don't think you even need the amvacuumstartup call, because it is *not* safe to assume that an index cleanup operation there will guarantee that vacuum won't try to remove pending tuples. Remember that a tuple inserted by a transaction that later aborted is DEAD and can be reclaimed instantly by VACUUM. So while in the case of VACUUM FULL it might be okay to call index_cleanup only once, for regular VACUUM I think you really have to call it within each bulkdelete operation. There's probably no point in optimizing it away in VACUUM FULL either, since surely it'll be fast to call index_cleanup when there's nothing in the pending list? - I thought about a statistics-based trigger for a separate call of insertcleanup. The trigger should be fired on massive insert/update events, very similar to the trigger on massive delete for ambulkdelete. I'm very sorry but I didn't do it yet, and definitely I need some help here. Yeah, I was going to complain about that next :-). Autovacuum isn't going to trigger as a result of INSERT operations; somehow we have to teach it what to do for GIN indexes. I remember we discussed this at PGCon but I don't think we decided exactly what to do... Do I revert that piece? I've already made a number of changes to the patch; let me keep working on it and send it back to you later. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] collation support revisited (phase 1)
Tom Lane wrote: Zdenek Kotala [EMAIL PROTECTED] writes: I think if we support the UTF8 encoding, then it makes sense to create our own charsets, because system locales could have collations defined for them. Say what? I cannot imagine a scenario in which a user-defined encoding would be useful. I did not mean user-defined encodings but user-defined charsets. For example, the [EMAIL PROTECTED] locale uses the UTF8 encoding, and its collation is defined on the Czech charset, which specifies the list of allowed characters. If somebody has e.g. a Turkish locale installed, then he will also want to have a Turkish charset in postgres. I guess a charset also defines how upper/lower case will work (see i/I in Turkish). Please, correct me if I'm wrong. thanks Zdenek -- Zdenek Kotala Sun Microsystems Prague, Czech Republic http://sun.com/postgresql -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCHES] GIN improvements
Teodor Sigaev [EMAIL PROTECTED] writes: I started to look at this. I don't understand why VACUUM does an insert cleanup before starting to vacuum, but VACUUM FULL doesn't? Hmm. Maybe I missed something, but I don't understand where and what... I tried to track all places of the ambulkdelete call. aminsertcleanup should be called before any ambulkdelete, because ambulkdelete doesn't scan the pending list, which can store items to be deleted, and hence the index would store item pointers to absent tuples. needed is the one at vacuum startup, which tempts me to propose that the new AM entry point should be called amvacuumstartup, instead of wiring in the assumption that what it's for is specifically cleanup of insertions. That's possible, but inserts into the index should be forbidden between amvacuumstartup and the last call of ambulkdelete. Well, if that is required to be true then this whole design is pretty broken, because VACUUM doesn't hold any lock that would guarantee that no insert happens between the two calls. If we fold the two AM calls into one call then it'd be okay for the index AM to take such a lock transiently during the single index-cleanup-plus-bulkdelete call. For VACUUM FULL there's no such issue because the whole table is locked, but I still don't see any real point in having two successive index AM calls when the AM could perfectly well do all the work in one call. Maybe it'd be better if ambulkdelete *did* scan the pending list? You'd still need at least page-level locking but perhaps not anything stronger. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCHES] GIN improvements
Well, if that is required to be true then this whole design is pretty broken, because VACUUM doesn't hold any lock that would guarantee that no insert happens between the two calls. If we fold the two AM calls into one call then it'd be okay for the index AM to take such a lock transiently during the single index-cleanup-plus-bulkdelete call. Actually, a lock isn't needed. bulkdelete just should not try to remove item pointers that have not yet been insertcleanuped. That's easy, because the VacPageList is prepared before the insertcleanup call. Maybe it'd be better if ambulkdelete *did* scan the pending list? I don't like that idea, because it requires adding a lot of code (concurrent deletion of pages in the list); it's much simpler to call insertcleanup inside ginbulkdelete/ginvacuumcleanup. You'd still need at least page-level locking but perhaps not anything stronger. It's close to trivial to revert this piece and add the cleanup call to ginbulkdelete/ginvacuumcleanup. Early variants did exactly that. The reasons for the new variant were: - determining the need to call insertcleanup; the stats argument was used for that in both functions. If it's NULL then call cleanup. - I thought about a statistics-based trigger for a separate call of insertcleanup. The trigger should be fired on massive insert/update events, very similar to the trigger on massive delete for ambulkdelete. I'm very sorry but I didn't do it yet, and definitely I need some help here. Do I revert that piece? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Plans for 8.4
Hi, Simon Riggs wrote: * Logical replication via transaction log * Integration with synchronous replication I'm curious what you mean by these two points. AFAIK you are speaking of logical replication for methods which don't rely on the internal storage format of the database, but instead replicate closer to the SQL level. But the transaction log (the XLOG for Postgres, REDO log for Oracle, etc.) is certainly bound to the storage format, so I'd classify the log shipping approaches as physical replication. What is it you are talking about? What do you plan to integrate with synchronous replication? Did you consider having a look at the Postgres-R project? Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema-qualified statements in pg_dump output
On Tue, 2008-07-22 at 16:58 +0100, Simon Riggs wrote: On Mon, 2008-07-21 at 23:53 -0400, Owen Hartnett wrote: No need to specify the name at pg_dump time. For text files, just use an option to specify whether we change the actual schema name and replace it with the text :PGDUMPSCHEMA. pg_dump --relocatable-schema (or alternate option name) Then when we reload, we just run psql -f pgdump.file -v PGDUMPSCHEMA=newlocation I like the idea but would prefer no shell variable (I assume that is what you are using above). Why not just -X target-schema=newlocation or something like that? Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Plans for 8.4
On Tue, 2008-07-22 at 10:23 -0700, David E. Wheeler wrote: On Jul 22, 2008, at 08:35, Simon Riggs wrote: * Hot Standby * Logical replication via transaction log * Integration with synchronous replication Getting these in 8.4 would so rock. Agreed. Thank me when it's done, 'cos it ain't done yet by a long way! -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: primary key patches
On Tuesday 22 July 2008, Christopher Browne wrote: A most pointed case where that will cause heartburn of the 'I refuse to use this' sort is if that disruption needs to take place when recovering from the failure of a node. That sort of disruption is certainly counterproductive to the usual goal of replication enhancing system availability. Maybe I am misreading you; I rather hope so. This part of Markus's mail makes me think the need may change if Postgres-R is ever integrated into -core: On Tuesday 22 July 2008, Markus Wanner wrote: As Postgres-R comes with system catalog changes, that's not the case. So currently to use Postgres-R you'd have to start with a patched code base at each and every node, because that's how Markus wanted to proceed (Postgres-R being a separate code base). In Postgres-R, adding a node to the cluster is what is done without a dump/restore cycle. Now that he's open-sourcing the solution, I hope to see this mode of operation change, thanks to integration of some key parts (catalog changes) of the project into -core, if possible. Note that while Slony doesn't require a dump/restore to get activated, it seems to me (as a non-user of it) that it still plays with the catalogs, preventing normal usage of pg_dump... I'm not sure which disease I prefer: not being able to dump/restore normally, or having to restore on a specific product version, not the -core one. Just my 2 cents, hoping I'm understanding correctly the point at hand here, -- dim signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] pltcl_*mod commands are broken on Solaris 10
Tom Lane wrote: Zdenek Kotala [EMAIL PROTECTED] writes: But the problem is also in configure, which does not fail when tclsh is not found. Yes it does ...

    if test $[#] -eq 0; then
        test -z "$TCLSH" && AC_MSG_ERROR([unable to locate tclConfig.sh because no Tcl shell was found])

Does that happen if you specify the location of tclConfig.sh? I assume it usually knows where tclsh is, but the pltcl utilities won't. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Plans for 8.4
On Tue, 2008-07-22 at 18:37 +0200, Markus Wanner wrote: Simon Riggs wrote: * Logical replication via transaction log * Integration with synchronous replication I'm curious what you mean by these two points. AFAIK you are speaking of logical replication for methods which don't rely on the internal storage format of the database, but instead replicate closer to the SQL level. But the transaction log (the XLOG for Postgres, REDO log for Oracle, etc.) is certainly bound to the storage format, so I'd classify the log shipping approaches as physical replication. What is it you are talking about? Reconstructing SQL from WAL, allowing logical apply. So yes, you're right, the actual replication of the data from one node to another is physical - it's the final apply of the changes that is logical. So this fits neatly with our commitment to put synchronous replication into the server. It allows WAL to be used as the replication transport, which reduces the impact of tracking changes via triggers. The presence of commit records in the sequence can be used to parallelize the apply phase if required. I'm proposing to put the mechanisms in place to allow WAL to be used by other replication projects. Which turn out to be relatively small changes to the main server. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: primary key patches
Hi, Dimitri Fontaine wrote: This part of Markus's mail makes me think the need may change if Postgres-R is ever integrated into -core: Yes, in that case, you'd have replication already compiled in and distributed with standard Postgres. However, ATM that's pipe dreaming and I'm pretty sure no developer (neither me nor the Postgres hackers) wants to mix code (and responsibility!) at this stage of development of Postgres-R. The most I'd be willing to ask for at the moment would be to get a range of OIDs reserved for use in Postgres-R. It would not make sense at the moment to add the schema changes to standard Postgres, because I will pretty much have to change these again. So currently to use Postgres-R you'd have to start with a patched code base at each and every node, because it's how Markus wanted to proceed (Postgres-R being a separated code base). In Postgres-R adding a node to the cluster is what is done without dump/restore cycle. Yup. Now that he's Open-Sourcing the solution, I hope to see this mode of operation change, thanks to integration of some key part (catalog changes) of the project into -core, if possible. Sorry, but at the moment, I disagree, because I think this would complicate matters for both projects. This might (and hopefully will) change, sure. But we are not there, yet. Note that while slony doesn't require a dump/restore to get activated, it seems to me (as a non user of it) that it still plays with catalog, preventing normal usage of pg_dump... Oh, does it? Well, it obviously doesn't require a Postmaster restart, nor does it add a separate background process. I'm not sure which disease I prefer: not being able to dump/restore normally or getting to have to restore on a specific product version, not the -core one. I think this process of moving between ordinary Postgres and Postgres-R schema variants for the same(!) major version can be automated. It would be a pretty small pg_upgrade sort of tool. I'm not that afraid of these schema changes. 
Heck, in the worst case, we could even let Postgres-R add them itself during startup. Sorry if this sounds a little rude. I've just had the 'why isn't Postgres-R integrated?' discussion a little too often. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] collation support revisited (phase 1)
Martijn van Oosterhout wrote: On Mon, Jul 21, 2008 at 03:15:56AM +0200, Radek Strnad wrote: I was trying to sort out the problem of not creating a new catalog for character sets, and I came up with the following ideas. Correct me if my ideas are wrong. Since a collation has to have a defined character set. Not really. AIUI at least glibc and ICU define a collation over all possible characters (ie unicode). When you create a locale you take a subset and use that. Think about it: if you want to sort strings and one of them happens to contain a Chinese character, it can't *fail*. Note strcoll() has no error return for unknown characters. It has. See http://www.opengroup.org/onlinepubs/009695399/functions/strcoll.html The strcoll() function may fail if: [EINVAL] [CX] The s1 or s2 arguments contain characters outside the domain of the collating sequence. I'm suggesting to use the already-written infrastructure of encodings and to use the list of encodings in chklocale.c. Currently databases are not created with a specified character set but with a specified encoding. I think instead of pointing a record in the collation catalog to another record in a character set catalog we might use only the name (string) of the encoding. That's reasonable. From an abstract point of view collations and encodings are orthogonal; it's only when you're using POSIX locales that there are limitations on how you combine them. I think you can assume a collation can handle any characters that can be produced by the encoding. I think you are not correct. You cannot use one collation over all of Unicode. See http://www.unicode.org/reports/tr10/#Common_Misperceptions. The same characters can be ordered differently in different languages. Zdenek -- Zdenek Kotala Sun Microsystems Prague, Czech Republic http://sun.com/postgresql -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema-qualified statements in pg_dump output
On Tue, 2008-07-22 at 13:35 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: No need to specify the name at pg_dump time. For text files, just use an option to specify whether we change the actual schema name and replace it with the text :PGDUMPSCHEMA. pg_restore is in an even worse position than pg_dump to make this happen; it would not be able to do anything smarter than a sed-like substitution. Somebody just needs to check carefully to see what will work.

I accept there is no easy option that is materially better than sed. I've screwed up a dump with sed before and was lucky to notice. I'm not playing Russian roulette again. The chance of the schema name being stored somewhere in the database seems high, on reflection.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Mon, 2008-07-21 at 15:43 -0400, Tom Lane wrote: From a maintenance point of view there seems little need for either project to get integrated: they don't appear to have much of any code that is tightly tied to backend innards. This is a slightly circular argument. They have had to be written with no linkage to core to allow them to be created outside of it. I agree with your general principles on inclusion of features and also agree that in this specific case the patches should be rejected. Growing up outside of core cannot be a reason to exclude new capabilities from core, but it is probably a reason to reject specific code. In both these cases, I can see that the capability could be provided in a different way and benefit from tighter integration. I think we should return them with comments that if you integrate them more with core *and* can justify having done so, then we might include those features later. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Slony-I playing with system catalog
Dimitri Fontaine [EMAIL PROTECTED] writes: Note that while slony doesn't require a dump/restore to get activated, it seems to me (as a non user of it) that it still plays with catalog, preventing normal usage of pg_dump... FYI, that will no longer be the case in version 2.0 of Slony-I; with the changes made in 8.3, it is no longer necessary to play with the catalog in the unclean ways that have traditionally made pg_dump break. -- select 'cbbrowne' || '@' || 'linuxfinances.info'; http://cbbrowne.com/info/lsf.html Rules of the Evil Overlord #145. My dungeon cell decor will not feature exposed pipes. While they add to the gloomy atmosphere, they are good conductors of vibrations and a lot of prisoners know Morse code. http://www.eviloverlord.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pltcl_*mod commands are broken on Solaris 10
Zdenek Kotala [EMAIL PROTECTED] writes: Yeah, I looked deeply at our Solaris build script and the problem is with the following configure setup:

    ./configure --prefix=/tmp/pg --with-tcl --with-tclconfig=/usr/sfw/lib

It found tclConfig.sh, but not tclsh.

Ah. So actually there is a bug in our configure: if you've set --with-tcl, and it fails to find tclsh, it should error out instead of allowing an incorrect path to be substituted into the pltcl_*mod scripts. The configure code is assuming that the only thing it really needs tclsh for is to find tclConfig.sh, but that's not so. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Plans for 8.4
On Jul 22, 2008, at 08:35, Simon Riggs wrote: * Hot Standby * Logical replication via transaction log * Integration with synchronous replication Getting these in 8.4 would so rock. Thank you! David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Jul 22, 2008, at 12:51, Simon Riggs wrote: I think we should return them with comments that if you integrate them more with core *and* can justify having done so, then we might include those features later I believe I've done both these things for citext, though if there is more to be done, I'm glad to do it. New patch coming later today, BTW. Thanks, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: primary key patches
Dimitri Fontaine [EMAIL PROTECTED] writes: Note that while slony doesn't require a dump/restore to get activated, it seems to me (as a non user of it) that it still plays with catalog, preventing normal usage of pg_dump... As of 8.3 there are some new trigger features in core that were put there for Slony. I'm not sure to what extent that will let them get rid of making nonstandard catalog changes ... perhaps Chris or Jan can explain. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: CITEXT 2.0 v4
On Jul 18, 2008, at 01:39, Michael Paesold wrote: Calling regex functions with the case-insensitivity option would be great. It should also be possible to rewrite replace() into regexp_replace() by first escaping the regex meta characters. Actually re-implementing those functions in a case-insensitive way would still be an option, but of course some amount of work. The question is how much of a use case there is.

I've figured out how to make all the functions work using SQL function workarounds, converting things and re-dispatching to the text versions as appropriate. They work quite well, and can be converted to C later if that becomes a requirement.

Meanwhile, on the question of whether or not regular expression and LIKE comparisons *should* match case-insensitively, I have a couple more observations:

* Thinking about how a true case-insensitive collation would work, I'm quite certain that it would match case-insensitively. Anything else would just be unexpected, because in a case-insensitive collation, lowercase characters are, in practice, identical to uppercase characters. As far as matching is concerned, there is no difference between them. So the matching operators and functions against CITEXT should follow that assumption.

* I tried a few matches on MySQL, where the collation is case-insensitive by default, and it confirms my impression:

    mysql> select 'Foo' regexp 'o$';
    +-------------------+
    | 'Foo' regexp 'o$' |
    +-------------------+
    |                 1 |
    +-------------------+
    1 row in set (0.00 sec)

    mysql> select 'Foo' regexp 'O$';
    +-------------------+
    | 'Foo' regexp 'O$' |
    +-------------------+
    |                 1 |
    +-------------------+
    1 row in set (0.00 sec)

    mysql> select 'Foo' like '%o';
    +-----------------+
    | 'Foo' like '%o' |
    +-----------------+
    |               1 |
    +-----------------+
    1 row in set (0.00 sec)

    mysql> select 'Foo' like '%O';
    +-----------------+
    | 'Foo' like '%O' |
    +-----------------+
    |               1 |
    +-----------------+
    1 row in set (0.00 sec)

I'll grant that MySQL may not be the best model for how things should work, but it's something, at least. Anyone else got access to another database with case-insensitive collations to see how LIKE and regular expressions work?
Thanks, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Plans for 8.4
Simon Riggs [EMAIL PROTECTED] writes: On Tue, 2008-07-22 at 18:37 +0200, Markus Wanner wrote: What is it you are talking about? Reconstructing SQL from WAL, allowing logical apply. ... I'm proposing to put the mechanisms in place to allow WAL to be used by other replication projects. Which turn out to be relatively small changes to the main server. [ retrieves eyebrows from ceiling... ] I doubt that's even possible, let alone a relatively small change. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Tom, Simon, etc.: One of the several things PostgreSQL could learn from MySQL, which we covered at pgCon, is that the requirement to hunt hither and yon for popular add-ins is one of the primary reasons developers don't use PostgreSQL. Further, one of the main reasons people do use PostgreSQL is our advanced functionality. If we focus only on core SQL features, there are few reasons to use us over MySQL, Oracle Express, SQL Server, or Firebird. Minimalism isn't its own reward.

Obviously Tom has reason to worry about the overall maintenance effort for the PostgreSQL code. But we need to balance that against the need to add features that users want and that will keep our community growing. If the way to do this is by packaging stuff together but maintaining separate CVS trees, then OK -- but then we need a plan for how we're going to do that, rather than just rejecting patches.

The general case aside, I really feel strongly that citext belongs in core unless we come up with some other means to do case-insensitive text. It's one of the top 10 newbie questions. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Plans for 8.4
Hi, Simon Riggs wrote: Reconstructing SQL from WAL, allowing logical apply. So yes, you're right, the actual replication of the data from one node to another is physical: it's the final apply of the changes that is logical. So this fits neatly with our commitment to put synchronous replication into the server.

Aha, thanks for your explanation.

It allows WAL to be used as the replication transport, which reduces the impact of tracking changes via triggers. The presence of commit records in the sequence can be used to parallelize the apply phase if required. I note you are comparing against trigger-based replication solutions. I'm proposing to put the mechanisms in place to allow WAL to be used by other replication projects. Which turn out to be relatively small changes to the main server.

Comparing with the Postgres-R way of doing it, there's a difference in the serialized change-set format. Where Postgres-R uses its own specialized format, you are proposing to use the existing WAL format to transfer change-set data, which is architecture-dependent and quite tightly bound to the physical format on disk. It cannot be changed to also carry other information required for replication without having an impact on WAL for single-node installations. I think the CPU load to serialize and deserialize is pretty similar, and most probably negligible.

Regards Markus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2008-07-21 at 15:43 -0400, Tom Lane wrote: From a maintenance point of view there seems little need for either project to get integrated: they don't appear to have much of any code that is tightly tied to backend innards. This is a slightly circular argument. They have had to be written with no linkage to core to allow them to be created outside of it. True, but in the form in which they are currently presented there is no (technical) reason to integrate them: no new capability would be provided thereby. Contrast with, say, text search, which we integrated mainly because we could provide easier configuration and a better dump/restore experience than the contrib module provided. In both these cases, I can see that the capability could be provided in a different way and benefit from tighter integration. Perhaps. I think a lot of the dump/restore issue could be solved generically if we had better module support ... but there's no need to go over that turf again right now. In the case of citext, I think an integrated solution would involve some way of creating case-insensitive collations, which would certainly be cool but it requires a whole lot of infrastructure we don't have yet. And it wouldn't look even a little bit like the present citext, nor be upward compatible with it. In the case of plproxy, I think an integrated solution is pronounced SQL-MED, and likewise plproxy in its present form doesn't move us toward that goal. An important point here is that acceptance of a feature into core (or even contrib) puts us on the hook to worry about upward compatibility for it, maybe not forever but for a long time into the future. I don't think I want to buy into that for either of these as presently constituted --- they don't match up with what I think the long-term goals ought to be in these areas. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: tuple serialization
On Jul 22, 2008, at 3:04 AM, Markus Wanner wrote:

Yesterday, I promised to outline the requirements of Postgres-R for tuple serialization, which we have been talking about before. There are basically three types of how to serialize tuple changes, depending on whether they originate from an INSERT, UPDATE or DELETE. For updates and deletes, it saves the old pkey as well as the origin (a global transaction id) of the tuple (required for consistent serialization on remote nodes). For inserts and updates, all added or changed attributes need to be serialized as well.

              pkey+origin   changes
    INSERT        -            x
    UPDATE        x            x
    DELETE        x            -

Note that the pkey attributes may never be null, so an isnull bit field can be skipped for those attributes. For the insert case, all attributes (including primary key attributes) are serialized. Updates require an additional bit field (well, I'm using chars ATM) to store which attributes have changed. Only those should be transferred.

I'm tempted to unify that, so that inserts are serialized as the difference against the default values or NULL. That would make things easier for Postgres-R. However, how about other uses of such a fast tuple applicator? Does such a use case exist at all? I mean, for parallelizing COPY FROM STDIN, one certainly doesn't want to serialize all input tuples into that format before feeding multiple helper backends. Instead, I'd recommend letting the helper backends do the parsing and therefore parallelize that as well. For other features, like parallel pg_dump or even parallel query execution, this tuple serialization code doesn't help much, IMO. So I'm thinking that optimizing it for Postgres-R's internal use is the best way to go. Comments? Opinions?

ISTM that both londiste and Slony would be able to make use of these improvements as well.
A modular replication system should be able to use a variety of methods for logging data changes and then applying them on a subscriber, so long as some kind of common transport can be agreed upon (such as text). So having a change capture and apply mechanism that isn't dependent on a lot of extra stuff would be generally useful to any replication mechanism. -- Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 smime.p7s Description: S/MIME cryptographic signature
Re: [HACKERS] Transaction-controlled robustness for replication
Hi, very nice proposal and thoughts. Allow me to compare against Postgres-R again. Simon Riggs wrote: Asynchronous commit controls whether we go to disk at time of commit, or whether we defer this slightly. We have the same options with replication: do we replicate at time of commit, or do we defer this slightly for performance reasons. DRBD and other replication systems show us that there is actually another difference when talking about synchronous replication: do we go to disk on the standby before acknowledging the primary? Yeah, I was thinking into the same direction for Postgres-R. There already exist three replication levels: sync, eager and lazy. Having more than just a primary and a standby server in mind, one can also argue about how many remote nodes need to have written the changes to disc, before commit is confirmed in 'sync' mode. At least a majority is required, probably more nodes are wanted. The eager mode is what the original Postgres-R approach is all about and is pretty much the only one I've implemented, at the moment. It only requires confirmation from the GCS, which means at least a majority of the nodes have received the change set (and will be able to apply it). (This leads to a corner case for a full cluster outage, see [1]). In async mode, commit is confirmed before sending the change set to other nodes. If we are able to define these robustness characteristics for each transaction *separately* then it will represent an industry first: Yeah, that would be pretty cool. no other database has that capability implemented or planned on published roadmaps, nor has it been discussed in research to my knowledge. Well, a partial implementation in Postgres-R, if that counts... 
;-) Regards Markus [1]: One of the few threads on the Postgres-R-general mailing list: http://pgfoundry.org/pipermail/postgres-r-general/2006-August/02.html -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Josh Berkus [EMAIL PROTECTED] writes: Tom, Simon, etc.: Of the several things which PostgreSQL could learn from MySQL which we covered at pgCon was that the requirement to hunt hither and yon for popular add-ins is one of the primary reasons for developers not using PostgreSQL. Agreed, but I think the best response to that is something CPAN-like for people to easily get hold of recognized extensions, and next best (or also) a kitchen sink installer package that agglomerates the core and selected outside projects. There aren't any successful extensible projects that ignore their own extensibility and put everything interesting into the core project. The general case aside, I really feel strongly that citext belongs in core unless we come up with some other means to do case-insensitive text. It's one of the top 10 newbie questions. Maybe. I'd be happier about it if I could see a reasonable upgrade path from that to a SQL-spec-compliant solution (ie, something collation based). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: tuple serialization
Hi, Decibel! wrote: ISTM that both londiste and Slony would be able to make use of these improvements as well. A modular replication system should be able to use a variety of methods for logging data changes and then applying them on a subscriber, so long as some kind of common transport can be agreed upon (such as text). So having a change capture and apply mechanism that isn't dependent on a lot of extra stuff would be generally useful to any replication mechanism. Hm.. yeah, that's a good hint. However, I'm not sure how londiste and Slony would interface with these internal methods. That would require some sort of special replication triggers or something. But when to fire them? After every statement (sync)? Just before commit (eager)? After commit (lazy)? (These are the points in Postgres-R, where the internal methods are called). I'm claiming that Postgres-R is modular (enough). But I'm unsure what interface it could provide to the outer world. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Tue, 2008-07-22 at 17:36 -0400, Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: Tom, Simon, etc.: Of the several things which PostgreSQL could learn from MySQL which we covered at pgCon was that the requirement to hunt hither and yon for popular add-ins is one of the primary reasons for developers not using PostgreSQL. Agreed, but I think the best response to that is something CPAN-like for people to easily get hold of recognized extensions, and next best (or also) a kitchen sink installer package that agglomerates the core and selected outside projects. There aren't any successful extensible projects that ignore their own extensibility and put everything interesting into the core project. It seems to me a better solution is to have appropriate repositories for distributions that have them than some cpan style thing that is going to break package dependencies. apt-get install postgresql-plproxy portinstall (I think that is the command) postgresql-plproxy etc... makes more sense. Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Joshua D. Drake [EMAIL PROTECTED] writes: On Tue, 2008-07-22 at 17:36 -0400, Tom Lane wrote: Agreed, but I think the best response to that is something CPAN-like for people to easily get hold of recognized extensions, It seems to me a better solution is to have appropriate repositories for distributions that have them than some cpan style thing that is going to break package dependencies. Better than CPAN is no problem ;-). My point is just that we should exploit PG's extensibility rather than assume that everything interesting must wind up in the core tarball. apt-get install postgresql-plproxy portinstall (I think that is the command) postgresql-plproxy I believe Devrim already has a yum repository up and running for RPM-based distros, though I'm not sure he's got anything but the core packages in it (yet). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Tue, 2008-07-22 at 17:54 -0400, Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: It seems to me a better solution is to have appropriate repositories for distributions that have them than some cpan style thing that is going to break package dependencies. Better than CPAN is no problem ;-). My point is just that we should exploit PG's extensibility rather than assume that everything interesting must wind up in the core tarball. Heh, o.k. :) apt-get install postgresql-plproxy portinstall (I think that is the command) postgresql-plproxy I believe Devrim already has a yum repository up and running for RPM-based distros, though I'm not sure he's got anything but the core packages in it (yet). Well that was certainly part of my point. We have http://www.pgsqlrpms.org/ We also push (a ton) of packages up to EPEL. I also know that Peter has been working on something similar with SuSE and Debian. E.g; in short let's work with respective projects to get these as part of the repositories. Joshua D. Drake regards, tom lane -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Tue, 2008-07-22 at 14:06 -0700, Josh Berkus wrote: Minimalism isn't its own reward. Obviously Tom has reason to worry about the overall maintenance effort for the PostgreSQL code. But we need to balance that against the need to add features that users want and will keep our community growing.

Well, "minimalistic" is a new compliment for me... ;-) Every time we add code to core, the next patch gets bigger, since we must always include all aspects of the server features. I want to *increase* the extensibility of Postgres with plugins and APIs. When you mention what we could learn from MySQL, I would say: introduce pluggable extensibility in more places. Solving the "putting the pieces back together" problem is a somewhat easier problem than trying to maintain a whole spittoon full of (cool) extensions in core.

Do you want Tom to a) spend a month improving the optimizer, or b) review already-working code so we can package things? It's a question of priorities.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Plans for 8.4
On Tue, 2008-07-22 at 16:54 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Tue, 2008-07-22 at 18:37 +0200, Markus Wanner wrote: What is it you are talking about? Reconstructing SQL from WAL, allowing logical apply. ... I'm proposing to put the mechanisms in place to allow WAL to be used by other replication projects. Which turn out to be relatively small changes to the main server. [ retrieves eyebrows from ceiling... ] I doubt that's even possible, let alone a relatively small change. I'm talking about building the transport layer into Postgres core only. SMOP outside of core, with the right changes. There's always potential blockers to any problem. We'll see when I publish the design. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transaction-controlled robustness for replication
On 7/22/08, Simon Riggs [EMAIL PROTECTED] wrote: We could represent this with 3 parameters:

    synchronous_commit = on | off
    synchronous_standby_transfer = on | off
    synchronous_standby_wal_fsync = on | off

If we are able to define these robustness characteristics for each transaction *separately* then it will represent an industry first: no other database has that capability implemented or planned on published roadmaps, nor has it been discussed in research to my knowledge.

Changing the parameter settings at transaction level would be expensive if we had to set three parameters. How about extending BEGIN with additional keywords?

-- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transaction-controlled robustness for replication
On Wed, 2008-07-23 at 01:39 +0300, Marko Kreen wrote: On 7/22/08, Simon Riggs [EMAIL PROTECTED] wrote: We could represent this with 3 parameters: synchronous_commit = on | off synchronous_standby_transfer = on | off synchronous_standby_wal_fsync = on | off If we are able to define these robustness characteristics for each transaction *separately* then it will represent an industry first: no other database has that capability implemented or planned on published roadmaps, nor has it been discussed in research to my knowledge. Changing the parameter setting at transaction-level would be expensive if we had to set three parameters. How about extending BEGIN.with additional keywords? SQL Standard, plus many interfaces hide BEGIN from you. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: tuple serialization
On Jul 22, 2008, at 4:43 PM, Markus Wanner wrote: Decibel! wrote: ISTM that both londiste and Slony would be able to make use of these improvements as well. A modular replication system should be able to use a variety of methods for logging data changes and then applying them on a subscriber, so long as some kind of common transport can be agreed upon (such as text). So having a change capture and apply mechanism that isn't dependent on a lot of extra stuff would be generally useful to any replication mechanism.

Hm.. yeah, that's a good hint. However, I'm not sure how londiste and Slony would interface with these internal methods. That would require some sort of special replication triggers or something. But when to fire them? After every statement (sync)? Just before commit (eager)? After commit (lazy)? (These are the points in Postgres-R where the internal methods are called.)

Currently, Slony triggers are per-row, not deferred. IIRC, londiste is the same. ISTM it'd be much better if we had per-statement triggers that could see what data had changed; that'd likely be a lot more efficient than doing stuff per-row. In any case, both replication systems should work with either sync or eager. I can't see them working with lazy. What about just making all three available?

I'm claiming that Postgres-R is modular (enough). But I'm unsure what interface it could provide to the outer world.

Yeah. I suspect that Postgres-R could end up taking the place of the replica-hooks mailing list (and more, of course). -- Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 smime.p7s Description: S/MIME cryptographic signature
Re: [HACKERS][PATCHES] odd output in restore mode
Below are my comments on the CommitFest patch: pg_standby minor changes for Windows.

Simon, I'm sorry you got me, a Postgres newbie, signed up for reviewing your patch ;) To start with, I'm not quite sure of the status of this patch since Bruce's last comment on the -patches alias: Bruce Momjian wrote: OK, based on these observations I think we need to learn more about the issues before making any changes to our code.

From easy to difficult:

1. Issues with applying the patch to CVS HEAD: The second file in the patch, Index: doc/src/sgml/standby.sgml, appears to be misnamed -- the existing file in HEAD is Index: doc/src/sgml/pgstandby.sgml. However, I still had issues after fixing the file name:

    [EMAIL PROTECTED]:~/pg/pgsql$ patch -c -p0 < ../pg_standby.patch
    patching file contrib/pg_standby/pg_standby.c
    patching file doc/src/sgml/pgstandby.sgml
    Hunk #1 FAILED at 136.
    Hunk #2 FAILED at 168.
    Hunk #3 FAILED at 245.
    Hunk #4 FAILED at 255.
    4 out of 4 hunks FAILED -- saving rejects to file doc/src/sgml/pgstandby.sgml.rej

2. Missing description for new command-line options in pgstandby.sgml: Simon Riggs wrote: Patch implements * recommendation to use GnuWin32 cp on Windows. Saw that in the changes to pgstandby.sgml, and it looks OK to me, but: no description of the proposed new command-line options -h and -p?

3. No coding style issues seen. Just one comment: the logic that selects the actual restore command to be used has moved from CustomizableInitialize() to main() -- a matter of personal taste, perhaps. But in my view the #ifdef WIN32/HAVE_WORKING_LINK logic has become easier to read.

4. Issue: missing break in switch, silent override of '-l' argument?
This behaviour has been in there before and is not addressed by the patch: The user-selected Win32 mklink command mode is never applied due to a missing 'break' in CustomizableInitialize(): switch (restoreCommandType) { case RESTORE_COMMAND_WIN32_MKLINK: SET_RESTORE_COMMAND(mklink, WALFilePath, xlogFilePath); case RESTORE_COMMAND_WIN32_COPY: SET_RESTORE_COMMAND(copy, WALFilePath, xlogFilePath); break; A similar behaviour exists on non-Win32 platforms, where the user-selected ln may be silently changed to cp in main(): #if HAVE_WORKING_LINK restoreCommandType = RESTORE_COMMAND_LN; #else restoreCommandType = RESTORE_COMMAND_CP; #endif If both Win32/non-Win32 cases reflect the intended behaviour: - I'd prefer a code comment on the above case fall-through, - I suggest a message to the user about the ignored ln / mklink, - I observe that the logic to override the '-l' option is now in two places: CustomizableInitialize() and main(). 5. Minor wording issue in usage message on new '-p' option I was wondering if the always in the usage text fprintf(stderr, -p always uses GNU compatible 'cp' command on all platforms\n); is too strong, since multiple restore command options overwrite each other, e.g. -p -c applies Windows's copy instead of Gnu's cp. 6. Minor code comment suggestion Unrelated to this patch, I wonder if the code comments on all four time-related vars would better read seconds instead of amount of time: int sleeptime = 5; /* amount of time to sleep between file checks */ int holdtime = 0; /* amount of time to wait once file appears full */ int waittime = -1; /* how long we have been waiting, -1 no wait * yet */ int maxwaittime = 0;/* how long are we prepared to wait for? */ 7. Question: benefits of separate holdtime option from sleeptime? 
Simon Riggs wrote: * provide holdtime delay, default 0 (on all platforms) Going back over the hackers+patches emails and parsing the code comments -- I'm sorry if I missed it -- I'm not sure I've understood the exact tuning benefit that introducing the new holdtime option provides over using the existing sleeptime, as has been the case so far (on Win32 only). 8. Unresolved question of implementing now/later a cp replacement Simon Riggs wrote: On Tue, 2008-07-01 at 13:44 +0300, Heikki Linnakangas wrote: This seems pretty kludgey to me. I wouldn't want to install GnuWin32 utilities on a production system just for the cp command, and I don't know how I would tune holdtime properly for using copy. And it seems risky to have defaults that are known to not work reliably. How about implementing a replacement function for cp ourselves? It seems pretty trivial to do. We could use that on Unixes as well, which would keep the differences between Win32 and other platforms smaller, and thus ensure the codepath gets more testing. If you've heard complaints about any of this from users, I haven't. AFAIK we're doing this because it *might* cause a problem. Bear in mind that link is the preferred performance option, not copy. So AFAICS we're tuning a secondary option on one specific port, without it being
Re: [HACKERS] Transaction-controlled robustness for replication
Simon Riggs wrote: Asynchronous commit controls whether we go to disk at time of commit, or whether we defer this slightly. We have the same options with replication: do we replicate at time of commit, or do we defer this slightly for performance reasons. DRBD and other replication systems show us that there is actually another difference when talking about synchronous replication: do we go to disk on the standby before acknowledging the primary? We can generalise this as three closed questions, answered either Yes (Synchronous) or No (Asynchronous):

* Does WAL get forced to disk on primary at commit time?
* Does WAL get forced across link to standby at commit time?
* Does WAL get forced to disk on standby at commit time?
* Does WAL get applied [and synced] to disk on standby at commit time? This is important if you want to use the standby as a read-only.

I am slightly confused about what the fsync setting does to all this, hence the brackets. I think that questions 2 and 3 are trivially bundled together. Once the user can specify 2, implementing 3 should be trivial, and vice versa. I am not even convinced that these need to be two different parameters. Also please note that an answer of yes to 3 means that 2 must also be answered yes. We could represent this with 3 parameters:

synchronous_commit = on | off
synchronous_standby_transfer = on | off
synchronous_standby_wal_fsync = on | off
synchronous_standby_apply = on | off   # just to propose a name

Changing the parameter setting at transaction-level would be expensive if we had to set three parameters. What exactly does expensive mean? All three parameters can probably be set in one TCP packet from client to server. Or we could use just a single parameter synchronous_commit = 'AAA', 'SAA', 'SSA', 'SSS' or on | off when no log-based replication is defined. Having the ability to set these at the transaction-level would be very cool. 
Having it set via a *single* parameter would make it much more viable to switch between AAA for bulk, low-importance data and SSS for very low volume, critical data, or somewhere in between, on the same server at the same time. The problem with a single parameter is that everything becomes position dependent, and if a new parameter is ever introduced, it's not easy to upgrade old application code. So the proposal in summary is * allow various modes of synchronous replication for perf/robustness * allow modes to be specified per-transaction * allow modes to be specified as a single parameter How about creating named modes? This would give the user the ability to define more fine-grained control, especially in larger clusters of fail-over/read-only servers, without totally clogging the parameter space and application code. Whether this should be done SQL-style or in some config file is not so clear to me, although I'd prefer SQL-style like

CREATE SYNCHRONIZING MODE immediate_readonly AS
LOCAL        SYNCHRONOUS APPLY
192.168.0.10 SYNCHRONOUS APPLY   -- read-only slave
192.168.0.11 SYNCHRONOUS APPLY   -- read-only slave
192.168.0.20 SYNCHRONOUS SHIP    -- backup-server
192.168.0.21 SYNCHRONOUS SHIP    -- backup-server
192.168.0.30 SYNCHRONOUS FSYNC   -- backup-server with fast disks
;

and then something like synchronize_mode = immediate_readonly; Yeah, I know, give patches not pipe-dreams :) Regards, Jens-Wolfhard Schicke -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH]-hash index improving
I'm sorry for the delayed reply. I couldn't get access to the internet for the last few days. I do apologize for my rough work and very bad readability. I posted it in a hurry and I didn't mean to cause the reader so much inconvenience. I'll NEVER make such a mistake again. Currently, I've made some of the optimizations Tom advised and removed the macro HASHVALUE_ONLY. And I'm working on fixing the problem that it crashes on large data sets. I'll post a new patch later. Thank you for all your advice and testing. -- Best Regards, Xiao Meng DKERC, Harbin Institute of Technology, China Gtalk: [EMAIL PROTECTED] MSN: [EMAIL PROTECTED] http://xiaomeng.yo2.cn -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH]-hash index improving
Well, I'll do it after I finish my second patch. Hash index should be more efficient than btree when N is big enough. It seems meaningful to find how big N is experimentally. On Fri, Jul 18, 2008 at 6:35 PM, Simon Riggs [EMAIL PROTECTED] wrote: On Fri, 2008-07-18 at 11:07 +0100, Gregory Stark wrote: Simon Riggs [EMAIL PROTECTED] writes: hash lookups can in theory be O(1). I'm not sure whether that applies here? I'm interested in how *this* patch will work, not in more generic algorithm theory. To patch authors: Can we please see a table showing expected number of logical I/Os (i.e. block accesses) for btrees and hash indexes, e.g. for 100-byte rows...

rows     btree    hash
-----    -----    ----
10^2
10^3
10^4
10^5
10^6
10^7
10^8

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Best Regards, Xiao Meng DKERC, Harbin Institute of Technology, China Gtalk: [EMAIL PROTECTED] MSN: [EMAIL PROTECTED] http://xiaomeng.yo2.cn -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Tue, 2008-07-22 at 17:54 -0400, Tom Lane wrote: apt-get install postgresql-plproxy portinstall (I think that is the command) postgresql-plproxy I believe Devrim already has a yum repository up and running for RPM-based distros, though I'm not sure he's got anything but the core packages in it (yet). I have about 50 packages there, and I do package many pgfoundry projects, like plproxy, pgsphere, pgpool, orafce, plpgpsm, table_log, etc. -- Devrim GÜNDÜZ devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org signature.asc Description: This is a digitally signed message part
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Joshua D. Drake [EMAIL PROTECTED] writes: On Tue, 2008-07-22 at 17:54 -0400, Tom Lane wrote: I believe Devrim already has a yum repository up and running for RPM-based distros, though I'm not sure he's got anything but the core packages in it (yet). Well that was certainly part of my point. We have http://www.pgsqlrpms.org/ ... E.g; in short let's work with respective projects to get these as part of the repositories. There's a limit to how far you can go there, because just about any distro (other than maybe Gentoo) is going to be resistant to dropping in bleeding-edge versions. *Especially* code that's not 99.44%+ compatible with what's in their current releases. To take the example I'm most closely familiar with: sure I can put the latest and greatest into Fedora rawhide, but that has approximately zip to do with what people are running in the field. As soon as a Fedora release happens, I'm constrained by compatibility issues as to what I can put into that branch. RHEL releases ten times more so. I gather that Debian, for instance, is even more paranoid than Red Hat about upstream version bumps. So I think the real-world situation is that we have to make stuff available to people who want to run something newer/different from what their chosen distro ships. That means providing our own repo. Certainly I've got no problem with pushing stuff to the official distros as fast as we can, but you've got to realize that that's gonna be a slow process, and necessarily always out of date for any distro version that is actually popular in the field. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH]-hash index improving
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Xiao Meng Sent: Tuesday, July 22, 2008 7:57 PM To: Simon Riggs Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [PATCH]-hash index improving Well, I'll do it after I finish my second patch. Hash index should be more efficient than btree when N is big enough. It seems meaningful to find how big N is experimentally. The savings will depend on many factors. Another thing (besides volume) which is important is the sort of key data being indexed. Consider a unique key on six varchar(40) fields:

1. Country
2. State/Province
3. City
4. Division
5. Branch
6. Office

Typically, a key such as this will have lots of clumps of similar data, only being differentiated in the final segment. This sort of index is often used for reporting purposes. To determine a unique entry, it is not unlikely that more than 200 characters will be traversed. A hash index gets a special boost here because a much smaller data signature is stored; even a 64-bit hash occupies only 8 bytes. On the other hand, consider an index on a field consisting of a single character. Here, the pages of the b-tree will have a huge volume of entries per page, requiring fewer pages to search, and the hash index is many times larger and hence more pages will have to be loaded.

These things make a hash index desirable:
1. Unique index
2. Long keys
3. Huge data cardinality
4. Equality search

These things make a hash index undesirable:
1. Non-unique index
2. Short keys
3. Small data sets

These things render a hash index worthless (except in COMBINATION with a b-tree type index):
1. Need for range searches like BETWEEN
2. Need for ORDER BY on the column(s)

As an aside: I guess it would also be nice if you could CLUSTER both index and data values on the hash. It may need a different algorithm than the b-tree clustering concept. 
I know that Rdb uses different kinds of storage areas for hashed indexes versus b-tree indexes. This effort to create hashed indexes is very valuable. Because it becomes more and more dominant as the data scales up, right at the point when things get tougher is when it becomes the most helpful. If you have a tiny table, it does not even matter if you index it, because (for instance) 10 rows will probably always stay in memory and iteration will find what is needed instantly. But if you have hundreds of millions of rows or billions of rows, that is when performance really matters. So when the data scales to preposterous size (which it has an uncanny ability to do), the boost in performance becomes even more valuable. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Tue, 2008-07-22 at 23:29 -0400, Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: On Tue, 2008-07-22 at 17:54 -0400, Tom Lane wrote: I believe Devrim already has a yum repository up and running for RPM-based distros, though I'm not sure he's got anything but the core packages in it (yet). Well that was certainly part of my point. We have http://www.pgsqlrpms.org/ ... E.g; in short let's work with respective projects to get these as part of the repositories. There's a limit to how far you can go there, because just about any distro (other than maybe Gentoo) is going to be resistant to dropping in bleeding-edge versions. Certainly. *Especially* code that's not 99.44%+ compatible with what's in their current releases. To take the example I'm most closely familiar with: sure I can put the latest and greatest into Fedora rawhide, but that has approximately zip to do with what people are running in the field. We could have a quality committee? Something that says, These 5 packages are considered stable by PGDG. Those go into the various repositories whether published directly to STABLE (or equiv) or are put into something like Universe. So I think the real-world situation is that we have to make stuff available to people who want to run something newer/different from what their chosen distro ships. That means providing our own repo. Yes that is what pgsqlrpms is. Certainly I've got no problem with pushing stuff to the official distros as fast as we can, but you've got to realize that that's gonna be a slow process, and necessarily always out of date for any distro version that is actually popular in the field. I should note that my point is about using proper package formats first, working with distros second. I am under no illusion that we will likely have to have our own repos (which is one of the reasons we have pgsqlrpms). The good news is, we have the beginnings of this already for at least three major distros. 
It should be relatively trivial to work with macports, fink and freebsd. I am sure the Open Solaris group would be more than happy to as well. Sincerely, Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Joshua D. Drake [EMAIL PROTECTED] writes: On Tue, 2008-07-22 at 23:29 -0400, Tom Lane wrote: There's a limit to how far you can go there, because just about any distro (other than maybe Gentoo) is going to be resistant to dropping in bleeding-edge versions. We could have a quality committee? Something that says, These 5 packages are considered stable by PGDG. Those go into the various repositories whether published directly to STABLE (or equiv) or are put into something like Universe. I don't think you got the point: such pronouncements would have exactly zero influence on Red Hat, or any other distro I'm familiar with. The *assumption* is that upstream thinks their new release is stable, else they wouldn't have made it. The distros are in the business of not believing that, until more proof emerges --- preferably from their own testing. I know that this is the mind-set at Red Hat, and I'm pretty sure SUSE and Debian work the same way. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
On Wed, 2008-07-23 at 00:01 -0400, Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: On Tue, 2008-07-22 at 23:29 -0400, Tom Lane wrote: There's a limit to how far you can go there, because just about any distro (other than maybe Gentoo) is going to be resistant to dropping in bleeding-edge versions. I actually think we are talking past each other. I know how distros work, all too well frankly. Our repos would be unofficial in the Red Hat eye. My point is, the Red Hat eye is irrelevant for a project like this. Those who are going to confine themselves to that ideal are a lost cause (for this project). They will run ancient versions of PostgreSQL, and that's cool, because they feel they can trust it. On the other hand, those who need 8.3 (on RHEL4 for example) can get it, now -- without breaking compatibility and with RPM. Sincerely, Joshua D. Drake regards, tom lane -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres-R: tuple serialization
Hi, Decibel! wrote: Currently, londiste triggers are per-row, not deferred. IIRC, londiste is the same. ISTM it'd be much better if we had per-statement triggers that could see what data had changed; that'd likely be a lot more efficient than doing stuff per-row. Well, now that I think about it... there might be *lots* of changes. Certainly something you don't want to collect in memory. At the moment, Postgres-R cannot handle this, but I plan to add an upper limit on the change set size, and just send it out as soon as it exceeds that limit, then continue collecting. (Note for the GCS adept: this partial change set may be sent via reliable multicast; only the very last change set before the commit needs to be totally ordered.) That would mean introducing another 'change set full' hook... In any case, both replication systems should work with either sync or eager. I can't see them working with lazy. Huh? AFAIK, londiste as well as Slony-I are both async. So what would hooks for sync replication be good for? Why not rather only lazy hooks? (Well, lazy hooks pose yet another problem: those theoretically need to run at some point *after* the commit, but at that time we don't have an open transaction, so where exactly shall we do this?) What about just making all three available? Doh. Ehm. That's a lot of work for something we are not even sure is good for anything. I'm certainly willing to help. And if other projects show enough interest, I might even add the appropriate triggers myself. But as long as this is all unclear, I certainly have more important things on my todo list for Postgres-R (see that TODO list ;-) ). Yeah. I suspect that Postgres-R could end up taking the place of the replica-hooks mailing list (and more, of course). Let's hope so, yeah! Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers