Re: [HACKERS] get rid of psql welcome message
Tom Lane wrote: Well, in general the *variable* parts of the banner were all put there because of fairly urgent need, and I'd resist removing them. It's the unchanging boilerplate that seems open to debate. I'm +1 for cutting that down to a single line. I don't care one way or the other about providing a .psqlrc option to suppress it altogether. It could be that even optional removal of the version number is a foot-gun for users who perhaps carelessly lose track of which version they are running and do something with it (such as rsync with another server's data dir or something silly like that) expecting the wrong version. I don't see how, if it were reduced to a single line, the indication of version number could possibly be considered problematic under any circumstances. Regards, - Naz. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Timezone view
Tom Lane wrote: Naz Gassiep [EMAIL PROTECTED] writes: I think that it would be great if pg_timezone_names and pg_timezone_abbrevs included a boolean field indicating if that item is in the Olson DB Huh? They're all in the Olson DB Not true: the zone.tab file has 398 zones defined, and in my PG 8.2 running on Debian, there are 564 timezones listed in pg_timezone_names. The field I propose would indicate which 398 of those 564 are listed in zone.tab, as those are the actual Olson-defined timezones; the rest are aliases. Regards, - Naz.
Re: [HACKERS] Timezone view
Alvaro Herrera wrote: Naz Gassiep wrote: It may also be beneficial to add the ISO 3166 column into that view; the data is in zone.tab and I can't see a reason not to include it. We also have the country name in iso3166.tab and the geo coordinates. And there is also a comment field. Which ones make sense to add? I'd say the country name and geo coordinates both make sense to add. Regards, - Naz.
[HACKERS] Timezone view
I brought this up a while ago, but I didn't get any responses; I assume everyone was too busy with 8.3. I think that it would be great if pg_timezone_names and pg_timezone_abbrevs included a boolean field indicating whether that item is in the Olson DB or is a system alias or other added item. This would make it far easier to integrate the data in the view with external data sources that also use the Olson DB. It may also be beneficial to add the ISO 3166 column into that view; the data is in zone.tab and I can't see a reason not to include it. - Naz.
[HACKERS] Data from zone.tab
Is there any reason that the zone.tab information is not included in the pg_timezone_names system view? ISTM that there is really no reason not to, as that view is really populated from that file anyway. There is a 1:1 mapping (assuming the aliases are mapped to the zone.tab entries they are aliases of) of entries in that view with entries in zone.tab. Reading an earlier thread on this matter, I think Magnus is behind the code that generates the view. What are the chances of getting at least the country code included in the pg_timezone_names system view? It'd really help out with i18n / L10n work, and given that PG already ships with that data present, it seems silly not to take advantage of it given how easy it would be to do so.
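To make the 1:1 mapping concrete, here is a minimal sketch (ordinary Python, not PostgreSQL source; the sample rows are illustrative) of pulling the country-code-to-zone-name pairs out of a zone.tab-format file, which is exactly the data the view currently omits:

```python
def parse_zone_tab(lines):
    """Yield (country_code, zone_name) pairs from zone.tab-format text."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        # tab-separated columns: ISO 3166 code, coordinates, zone name[, comments]
        yield fields[0], fields[2]

sample = [
    "# tz zone descriptions",
    "AU\t-3352+15113\tAustralia/Sydney\tNew South Wales (most areas)",
    "DE\t+5230+01322\tEurope/Berlin",
]
print(list(parse_zone_tab(sample)))
# -> [('AU', 'Australia/Sydney'), ('DE', 'Europe/Berlin')]
```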
Re: [HACKERS] Data from zone.tab
Sorry to reply to my own post, but there should also be a field in the system view, is_alias, so that devs are able to tell which zone names are in the zone.tab file and which are not. That way a perfect 1:1 mapping between zone.tab and app can be made. If this were done, it'd make things like using CLDR data and other standardized data sources easier, as you could be confident that all timezone names matched the data in the CLDR. I think what I'm trying to say is that using and applying standards is a good thing. - Naz. Naz Gassiep wrote: Is there any reason that the zone.tab information is not included in the pg_timezone_names system view? ISTM that there is really no reason not to, as that view is really populated from that file anyway. There is a 1:1 mapping (assuming the aliases are mapped to the zone.tab entries they are aliases of) of entries in that view with entries in zone.tab. Reading an earlier thread on this matter, I think Magnus is behind the code that generates the view. What are the chances of getting at least the country code included in the pg_timezone_names system view? It'd really help out with i18n / L10n work, and given that PG already ships with that data present, it seems silly not to take advantage of it given how easy it would be to do so.
Re: [HACKERS] Spoofing as the postmaster
The problem with forcing authentication is that an auth-unaware client connecting to a legitimate postmaster would have its connections refused, while that same client would have its connections accepted by an impostor postmaster. Thus, there is no way to stop impostor postmasters from carrying out these attacks on auth-unaware clients. The proper solution, as I see it, would be to have an authentication system in the postmaster that is not enforced: if the client requests authentication, the postmaster will provide it; if not, the postmaster will connect normally without it. This would not result in *any* change in the default behavior of the postmaster, and as far as users who don't want to use it are concerned, they don't even need to bother to turn it off (assuming that having it turned on does not consume extra resources, and I don't think an unused authentication mechanism sitting in the postmaster connection establishment routine would). At first glance this does not appear to result in greater security; however, it does. It allows DBAs who suspect that they are likely to be the target of these attacks to deploy authentication procedures in their client packages. This could be a modification to their applications, or whatever steps are necessary to mandate authenticated connections within their organization. There is no point forcing some auth mechanism within the postmaster, as attackers would simply catch users using software that did not require the server to auth before sending passwords. For this reason it is not the postmaster's responsibility to check that unknown clients do not connect to impostors; it is the postmaster's responsibility, however, to authenticate itself if the client asks for it.
So the onus (rightfully, in my opinion) falls upon network administrators / DBAs to ensure that all of their users are using auth-enabled client packages which will not allow connections to be established with a postmaster until authentication has passed, and to disallow the use of other client software to connect to the postmaster. In my view, this puts the security responsibility where it rightfully belongs *and* avoids breaking client packages in the wild. Making a server that *requires* auth and disallows non-authed clients is pointless, as there is nothing stopping attackers from setting up an auth-disabled impostor and waiting for someone to just connect using psql or some other vanilla connection method. The onus really ought to be with the administrators who give their users the software they use to connect to ensure that the software adheres to the relevant security policy, in the same way that it's their responsibility to ensure that the client software does not contain keyloggers and other such trashware. In the web world, it is the client's responsibility to check the SSL cert and not do their banking at www.bankofamerica.hax0r.ru, and there is nothing that the real banking site can do to stop them using their malware-infested PC to connect to the phishing site. They can only provide a site that offers authentication. This is analogous to the postmaster: it is only the responsibility of the postmaster to provide the option of authentication; it is the client's responsibility to know if they should use it and, if so, to ensure they do so properly. Regards, - MrNaz.com
Re: [HACKERS] todo: Hash index creation
Wow... not sure how I missed that. I *did* create this schema ages ago, perhaps it wasn't there, or at the time I had no idea what the implications were. *shrug* Regards, - Naz. Tom Lane wrote: Naz Gassiep [EMAIL PROTECTED] writes: As a result, when creating tables containing large blocks of text I wish to index, I've been using HASH as an index method. Please can we state in the manual that HASH index types are in a beta stage of development or something similar, or perhaps remove the manual entry altogether until HASH is at a point where it is usable in production. Uh, the manual already does say Note: Testing has shown PostgreSQL's hash indexes to perform no better than B-tree indexes, and the index size and build time for hash indexes is much worse. Furthermore, hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. For these reasons, hash index use is presently discouraged. under 11.2 Index Types, as well as various derogatory remarks elsewhere. regards, tom lane
Re: [HACKERS] todo: Hash index creation
Actually I think the *most* important thing to work on is to get hash to the point where its search speed actually beats btree consistently, so that it has an excuse to live. If that is insoluble we might well end up ripping it out entirely. (The first three TODO items for hash indexes are ideas for trying to improve the speed.) Fixing the WAL support would come after that, and bring it to the point where someone could actually recommend it for production use. After that it would be sensible to work on inessentials like improving the build speed. I've been warned away from hash indexes before; however, I had no idea that their performance was so abysmal that BTREE beat them, and I was definitely not aware that they were not included in WAL logs. I was told it wasn't as good as it could be, but I wasn't told it was pretty much an alpha piece of code. As a result, when creating tables containing large blocks of text I wish to index, I've been using HASH as an index method. Please can we state in the manual that HASH index types are in a beta stage of development or something similar, or perhaps remove the manual entry altogether until HASH is at a point where it is usable in production. Regards, A very surprised n00b.
Re: [HACKERS] pg_get_tabledef
Just a question: is there any advantage to having this over building a function in applications that wraps and uses pg_dump with a few options? Surely that's a more appropriate way to achieve this functionality? - Naz. Usama Munir wrote: Hi, i was following a thread some time ago where adding a function *pg_get_tabledef* was one of the TODOs for 8.2, but it somehow didn't make it to the release, perhaps because the functionality was not clearly defined? not sure. Anyway i happen to have come up with a function for pg_get_tabledef which works something like the following: /postgres=# select pg_get_tabledef(16388) ; pg_get_tabledef CREATE TABLE public.dept ( deptno numeric(2,0) NOT NULL, dname character varying(14), loc character varying(13) ) WITHOUT OIDS; /(1 row) i wanted to submit a patch for this, IFF the community wants this function. The rationale is obviously to help application developers writing applications like pgAdmin; currently this part of the SQL needs to be constructed manually for postgres by the tools. it is arguable that a table definition will have constraints, triggers etc as well, and they can be added without much problem, but i think if a tool needs to construct SQL for all table-related objects then functions are already available for them, like pg_get_constraintdef, pg_get_ruledef, pg_get_indexdef, pg_get_triggerdef etc. i understand that you guys don't like someone to develop a patch in isolation and just come up with it one day, but it really came out as a by-product of some other work, and i thought you guys might be interested in it. if it is desired, i will submit a patch for it within a day or so. Regards, Usama Munir EnterpriseDB (www.enterprisedb.com)
Re: [HACKERS] Feature freeze progress report
Andrew Dunstan wrote: Naz Gassiep wrote: I believe the suggestion was to have an automated process that only ran on known, sane patches. How do we know in advance of reviewing them that they are sane? Same way as happens now. I would assume this mechanism would only be applied to patches that had already been approved to contrib, or by some other measure that can be used to isolate only those patches that we *expect* to already be working. The intention of this mechanism, in my head, is just to help us make sure that regression issues on patches get detected sooner. What is more, we often run into situations where patch a will require changes in patch b, so testing them individually against CVS is not likely to be terribly useful. Yeah; given that this proposition is for an automated system, perhaps it could be designed to apply combinations of patches together to look for conflicts. Frankly, our problems are not primarily technological. They have to do mainly with scarcity of available time from competent reviewers. No amount of automation will fix that. I fully understand that. However, I find the idea of an automated process checking for big issues while we're all sleeping to be... sexy. I'm not sure how difficult a system like this would be to set up, but it doesn't seem to me to be the sort of thing that requires more than a few simple scripts. If it's not too hard to set up, even if it only yields small and rare benefits, it will have been a worthwhile exercise. My 2c (adjusted for inflation). Regards, - Naz
Re: [HACKERS] Feature freeze progress report
What is approved to contrib? The problem here is that having reasonable certainty that a patch is not malicious requires having gone over it in some detail; at which point you might as well apply the thing. Or if you didn't apply it, you bounced it for reasons that are unlikely to have anything to do with needing more automated testing. ISTM this idea can only work if we have a second tier of reviewers who are considered good enough to vet patches as safe, but not quite good enough to certify them as committable. I'm not seeing a large pool of people volunteering to hold that position --- at best it'd be a transitory state before attaining committerdom. If you are relying on a constant large influx of new people, you are doomed to failure (see Ponzi scheme for a counterexample). Yep. For the record, Ponzi died in poverty, so it's not a counterexample; it just proves that any gains that are had will be short-lived and increase the size of the crash when crunch time comes. :) - Naz.
Re: [HACKERS] Feature freeze progress report
I believe the suggestion was to have an automated process that only ran on known, sane patches. I don't think he was suggesting a mechanism for the great unwashed masses to dump arbitrary code into and have it applied in the buildfarm. You'd have an inventory of patches (you could use a hash to ensure they hadn't changed just before they are automatically applied) that were verified as good, and the system would apply them to HEAD periodically. Even if the patch inventory wasn't kept right up to date, this system could potentially help many regression issues or bugs to surface sooner, and as it would require zero work once set up besides system maintenance (which should be low if it is implemented in a reasonably intelligent manner), I feel that it is a great idea. Generally, I am all for automating mundane tasks as much as possible. Regards, - Naz. Andrew Dunstan wrote: Marc Munro wrote: On Mon, 30 Apr 2007, Heikki Linnakangas wrote: If we had a 1-2 line status blurb attached to each patch in the queue, like waiting for review, author is fixing issue XX, etc., that might help. Bruce would need to do that if we keep the current patch queue system unmodified; otherwise we'd need to switch to something else. Would it be possible to also automatically determine some sort of bit-rot status? What I had in mind was an automated process that would apply each patch to HEAD on a daily basis and report whether the patch still applies cleanly and still allows all regression tests to pass on at least one platform.
If and when the result of these tests changes from pass to fail, the patch submitter would be automatically notified. The patch status could then also show the last time at which the patch applied cleanly, and the last time that regression tests ran successfully. This or something similar has been discussed in the past w.r.t. the buildfarm. One major problem is that most sane system owners won't want to apply, compile and run an arbitrary patch. It could well have an intended or unintended trojan horse, for example. So you'd need some level of sanity checking to be done by some trusted person even to get it to this stage. cheers andrew
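The "use a hash to ensure they hadn't changed" step mentioned above could be as simple as recording a digest at review time and re-checking it before each automated apply. A sketch (the function name and sample patch bytes are made up for illustration):

```python
import hashlib

def patch_unchanged(patch_bytes, expected_sha256):
    """Return True iff the patch still matches the digest recorded at review time."""
    return hashlib.sha256(patch_bytes).hexdigest() == expected_sha256

patch = b"--- a/foo.c\n+++ b/foo.c\n"
recorded = hashlib.sha256(patch).hexdigest()  # taken when the patch was vetted

print(patch_unchanged(patch, recorded))        # unchanged patch passes
print(patch_unchanged(patch + b"x", recorded)) # any tampering is caught
```

Only patches passing this check would ever be fed to the buildfarm animals; anything else would be dropped from the inventory and flagged for human review.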
[HACKERS] tsearch2 in 8.3
A few of us on IRC were wondering what the status of tsearch2 is in 8.3. Was it decided to include it in core, or did we decide to keep FTS as a plugin? Some brief comments from anyone on the inside of the whole FTS issue would be greatly appreciated by us mere end users. Regards, - Naz.
Re: [HACKERS] Daylight Saving Time question PostgreSQL 8.1.4
Granted, but a configure switch would allow users who want to use the OS TZ files to do so in conjunction with a compiled-from-source installation. Many users of OSes with package managers, such as Debian or RedHat, may for whatever reason want to install from a source tarball and also use the OS TZ list. That being said, this user group may be small enough to ignore; just throwing it in for thought. Tom Lane wrote: Josh Berkus josh@agliodbs.com writes: Zdenec, I have the following idea: 1) add a guc variable which enables usage of OS time zone files 2) add extra parameters into the ./configure script which enable OS TZ support in the code and get the path to the OS TZ files. If we're adding it as a configure-time variable, there's no reason to have a GUC. I see zero reason to have either. It would only make sense to do this in the context of a platform-specific distribution such as an RPM, and in that context the simplest solution is to let the RPM specfile make the substitution (ie, after "make install" and before packaging, rm -rf PG's timezone tree and insert a symlink). Then it's on the RPM packager's head whether it's the right thing to do or not. A configure switch strikes me as mostly a foot-gun, because the average user of Postgres won't have any way to know whether the files are compatible. regards, tom lane
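The rm-and-symlink substitution Tom describes can be demonstrated in a scratch directory; the paths below are placeholders, and a real packager would operate on the installed share/timezone tree after "make install":

```python
# Scratch-directory demonstration of the rm-and-symlink substitution.
import os
import shutil
import tempfile

tmp = tempfile.mkdtemp()
os_tz = os.path.join(tmp, "usr-share-zoneinfo")   # stand-in for the OS tzdata
pg_tz = os.path.join(tmp, "pgshare", "timezone")  # stand-in for PG's own tree
os.makedirs(os_tz)
os.makedirs(pg_tz)

shutil.rmtree(pg_tz)      # remove PG's private timezone tree...
os.symlink(os_tz, pg_tz)  # ...and point it at the OS copy instead

link_ok = os.path.realpath(pg_tz) == os.path.realpath(os_tz)
print(link_ok)
shutil.rmtree(tmp)
```

From then on the server reads whatever tzdata the OS package manager keeps current, which is exactly the compatibility question Tom raises: nothing checks that those files match what the PG binaries expect.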
Re: [HACKERS] PostgreSQL - 'SKYLINE OF' clause added!
I do see your points regarding the existence of use cases for this feature, and I agree that at worst the implementation of this feature would provide a way to greatly simplify query design, and at best a whole new method of obtaining decision-supporting data from a relational database. However, I strongly disagree with your fourth point, i.e., that users will only become aware of it once it has been implemented. This sort of mentality is what gave us the sad case of late-90s HTML, in which browser vendors assumed that they could use the "if you build it, they will come" argument for feature extension of the HTML spec. That is a debacle we are still suffering the effects of. Let us not do the same to SQL and implement SKYLINE on our own, only to have other DBMS vendors implement it in different ways, and then finally, when the SQL standard includes it, they try to make some kind of average approximation of the implementations, resulting in *none* of the DBs being compliant. Then we'll be between the rock of breaking backwards compatibility and the hard place of unwarranted standards non-compliance. While Josh did point out the value of being in the leading group as far as implementing new functionality goes, I feel that it has to be weighed against the need not to strike out too aggressively, potentially isolating ourselves with excessive non-standard syntax or behavior. While I am convinced there is a strong use case for this functionality and we should definitely start looking at it, I don't see why we should be in a rush to get it into core. People have survived without it up to now; I don't think our userbase will suffer if it is implemented 6 months after foo commercial DB implements it; at least, not as much as it will suffer if we start drifting away from standards compliance. Just my 2 rupees. :) - Naz Nikita wrote: Few things from our side: 1. 'Skyline Of' is a new operator proposed in ICDE 2003, one of the topmost conferences of Data Engineering.
Skyline operation is a hot area of research in query processing. Many of the database community people do know about this operator, and it is fast catching attention. 2. The skyline operation is very useful in data analysis. Suppose we have a cricket database, and we want to find the bowlers who have taken maximum wickets in minimum overs; we can issue an easy-to-write query using the 'Skyline of' syntax as follows: Select * from Player_Match Skyline Of overs_bowled min, wickets_taken max; This query gives 25 interesting tuples (result set) out of 24750 tuples in 0.0509 seconds. The same result is obtained in 0.8228 seconds if the following equivalent nested query is issued: select * from Player_Match p1 where not exists ( select * from Player_Match p2 where p2.overs_bowled <= p1.overs_bowled and p2.wickets_taken >= p1.wickets_taken and (p2.overs_bowled < p1.overs_bowled or p2.wickets_taken > p1.wickets_taken)) Note that the above time is the time elapsed between issuing a query and obtaining the result set. As can be seen, the above query looks pretty cumbersome to write and is inefficient too. So, which query will the user prefer? As the number of dimensions increases, writing a nested query will become a hideous task. Btw, how can such a query be written using aggregate function syntax?? 3. As far as optimizing the Skyline is concerned, it is still a research problem, since it requires estimating the cardinality of the skyline result set. 4. Until and unless this operator is implemented in a popular database system, how can a user ever get to know about it and hence appreciate its usefulness? Btw, it was our B.Tech final year project, and not a term project :-) Regards. On 3/8/07, *Tom Lane* [EMAIL PROTECTED] wrote: Shane Ambler [EMAIL PROTECTED] writes: Tom Lane wrote: Well, whether it's horrible or not is in the eye of the beholder, but this is certainly a non-standard syntax extension.
Being non-standard should not be the only reason to reject a worthwhile feature. No, but being non-standard is certainly an indicator that the feature may not be of widespread interest --- if it were, the SQL committee would've gotten around to including it; seems they've managed to include everything but the kitchen sink already. Add to that the complete lack of any previous demand for the feature, and you have to wonder where the market is. The fact that several different groups have been mentioned to be working on this feature would indicate that it is worth considering. It looks to me more like someone published a paper that caught the attention of a few profs looking for term projects for their students. Now maybe it really is the best idea since sliced bread and will be seen in the next SQL spec edition, but color me
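For readers unfamiliar with the operator: the dominance test underlying SKYLINE can be sketched in a few lines. Python is used here purely for illustration; the column semantics follow the cricket example in this thread, with overs minimized and wickets maximized:

```python
def skyline(rows):
    """Keep tuples (overs, wickets) not dominated by any other tuple.

    A tuple b is dominated by a if a is at least as good on both axes
    (fewer-or-equal overs, more-or-equal wickets) and strictly better
    on at least one of them.
    """
    def dominates(a, b):
        return (a[0] <= b[0] and a[1] >= b[1]) and (a[0] < b[0] or a[1] > b[1])

    return [r for r in rows if not any(dominates(s, r) for s in rows)]

# (overs_bowled, wickets_taken) for five hypothetical bowlers:
bowlers = [(10, 3), (8, 3), (12, 5), (8, 1), (9, 4)]
print(sorted(skyline(bowlers)))
# -> [(8, 3), (9, 4), (12, 5)]
```

This naive O(n^2) filter is semantically the same as the NOT EXISTS query quoted above; the research question is doing it faster and estimating how many tuples survive.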
Re: [HACKERS] Proposal: Commit timestamp
I would be *very* concerned that system time is not a guaranteed monotonic entity. Surely a counter or other internally managed mechanism would be a better solution. Furthermore, what would be the ramifications of master and slave system times being out of sync? Finally, what if system time is rolled forward a few minutes as part of a correction and there were transactions completed in that time? There is a chance, albeit small, that two transactions will have the same timestamp. More importantly, this will throw all kinds of issues in when the slave sees transactions in the future. Even with regular NTP syncs, drift can cause a clock to be rolled forward a few milliseconds, possibly resulting in duplicate transaction IDs. In summary, I don't think the use of system time has any place in PostgreSQL's internal consistency mechanisms; it is too unreliable an environment property. Why can't a counter be used for this instead? - Naz. Jan Wieck wrote: For a future multimaster replication system, I will need a couple of features in the PostgreSQL server itself. I will submit separate proposals per feature so that discussions can be kept focused on one feature per thread. For conflict resolution purposes in an asynchronous multimaster system, the last update definition often comes into play. For this to work, the system must provide a monotonically increasing timestamp taken at the commit of a transaction. During replication, the replication process must be able to provide the remote node's timestamp so that the replicated data will be as of the time it was written on the remote node, and not the current local time of the replica, which is by definition of asynchronous later. To provide this data, I would like to add another log directory, pg_tslog. The files in this directory will be similar to the clog, but contain arrays of timestamptz values. On commit, the current system time will be taken.
As long as this time is lower than or equal to the last taken time in this PostgreSQL instance, the value will be increased by one microsecond. The resulting time will be added to the commit WAL record and written into the pg_tslog file. If a per-database configurable tslog_priority is given, the timestamp will be truncated to milliseconds and the increment logic is done on milliseconds. The priority is added to the timestamp. This guarantees that no two timestamps for commits will ever be exactly identical, even across different servers. The COMMIT syntax will get extended to COMMIT [TRANSACTION] [WITH TIMESTAMP timestamptz]; The extension is limited to superusers and will override the normally generated commit timestamp. This will be used to give the replicating transaction on the replica the exact same timestamp it got on the originating master node. The pg_tslog segments will be purged like the clog segments, after all transactions belonging to them have been stamped frozen. A frozen xid by definition has a timestamp of epoch. To ensure a system using this timestamp feature has enough time to perform its work, a new GUC variable defining an interval will prevent vacuum from freezing xids that are younger than that. A function get_commit_timestamp(xid) returning timestamptz will return the commit time of a transaction as recorded by this feature. Comments, changes, additions? Jan
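The tie-breaking rule Jan describes (bump by one microsecond whenever the clock has not advanced past the last issued stamp) can be sketched as follows; this illustrates the idea only and is not the proposed C implementation:

```python
def make_commit_stamper(clock):
    """Return a function issuing strictly increasing commit timestamps.

    clock() yields the current system time (microseconds since epoch, say);
    stalls or backward steps are papered over by incrementing the last stamp.
    """
    last = [None]  # mutable cell holding the last issued stamp

    def next_stamp():
        now = clock()
        if last[0] is not None and now <= last[0]:
            now = last[0] + 1  # clock stalled or went backwards: bump by 1 us
        last[0] = now
        return now

    return next_stamp

# A clock that stalls and even runs backwards (as after an NTP step):
readings = iter([1000, 1000, 999, 1005])
stamp = make_commit_stamper(lambda: next(readings))
print([stamp() for _ in range(4)])
# -> [1000, 1001, 1002, 1005]
```

Note that this guarantees local monotonicity only; as the thread points out, it does nothing about two nodes whose clocks disagree, which is what the per-database priority offset in the proposal is meant to disambiguate.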
Re: [HACKERS] tsearch in core patch, for inclusion
Andrew Dunstan wrote: I am constantly running into this: Q. Does PostgreSQL have full text indexing? A. Yes, it is in contrib. Q. But that isn't part of core. A. *sigh* Where on the website can I see what plugins are included with PostgreSQL? Where on the website can I see the official PostgreSQL documentation for full text indexing? With TSearch2 in core, will that fix the many upgrade problems associated with using TSearch2? contrib is a horrible misnomer. Can we maybe bite the bullet and call it something else? After years of PG use, I am still afraid to use contrib modules because it just *feels* like voodoo. I have spent much time reading this mailing list and on IRC with PG users, and I know that contrib modules are on the whole tested and safe, but the lack of web documentation, and of any indication of what they do other than checking the notes that come with the source, makes them feel like a use-and-cross-your-fingers type of thing. I don't know how hard it would be to implement, but perhaps contrib modules could be compiled in a similar way to Apache modules, e.g., ./configure --with-modulename, with the onus for packaging them appropriately falling onto the shoulders of the module authors. I feel that even a basic module management system like this would greatly increase awareness of, and confidence in, the contrib modules. Oh, and +1 on renaming contrib, +1 on the need for a comprehensive list of them, and +1 on the need for more doc on the website about each of them, with the onus falling on module authors; perhaps require at least a basic doc patch as a requirement for /contrib inclusion. - Naz
[HACKERS] dump / restore functionality
At the risk of being chastised for reviving old issues, I was wondering what the chances are of getting the dump / restore selectivity into 8.2. I am referring to the idea that, instead of the current 2 parts, a dump could be broken up into 3 parts, namely tables, data and everything else, so that data from one dump could be mixed and matched with schema defs from another dump easily and scriptably. I think the previous discussion concluded that the functionality would be best implemented as a selective restore rather than a breakable dump, due to the risk of inconsistent restores; you could then restore just the tables, data or everything-else components from a given dump. Did this item make it onto the to-do list? If so, did anyone pick it up, or will I be waiting until a future as-yet-undefined date? More generally, is there a publicly accessible place one can see the to-do items, who has adopted which ones, and what the status is on them? Sorry for asking this, but I am still a rather new participant here.
Re: [HACKERS] dump / restore functionality
Zero, because feature freeze is over. Aah yes, fair enough. If you find this feature interesting, you are free to drive the development yourself, independent of it appearing on any list. To avoid tears later on, look for a consensus about the merit of the feature first, though. This has been discussed already, and there was a not insignificant amount of support for it; IIRC, Tom Lane agreed that such functionality would be useful. Tom, are you aware whether this item made it onto the to-do list?
Re: [HACKERS] dump / restore functionality
None, but feel free to start coding for 8.3. My coding skills are still pretty nascent, but I shall do my best. That seems like a rather spectacular overstatement of the likely benefits, not to mention a misdescription of what was discussed. Once again I get pulled over by the semantics police :) Yes, you are right, that's what was discussed, and that is the functionality I am hoping for, as it would allow scripting the merging of a schema from one database with the table data from another. Did this make it into the to-do list for 8.3?
[HACKERS] Timezone List
Any chance of a client-accessible list of allowable time zones? I've been told that the only way to get at this list is by looking through the source and lifting it from zone.tab. While I'm at it, how about an accessible list of country codes? I know that it's not core DB functionality, but these lists are so universally useful that making users parse the files and store them in tables seems silly. What are other people's thoughts on this? Good idea or not?
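For anyone stuck doing that parsing in the meantime, zone.tab is at least a simple format. A rough Python sketch of a parser (column layout per the tzdata distribution: country code, ISO 6709 coordinates, zone name, optional comment, tab-separated, with '#' comment lines):

```python
# Minimal parser for tzdata's zone.tab. Each non-comment line has
# tab-separated columns: ISO 3166 country code, coordinates, zone
# name, and an optional comment.
def parse_zone_tab(text):
    zones = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        fields = line.split("\t")
        zones.append({
            "code": fields[0],
            "coords": fields[1],
            "name": fields[2],
            "comment": fields[3] if len(fields) > 3 else None,
        })
    return zones

sample = """# tz zone descriptions
AU\t-3352+15113\tAustralia/Sydney\tNew South Wales - most locations
AD\t+4230+00131\tEurope/Andorra
"""
for z in parse_zone_tab(sample):
    print(z["code"], z["name"])
```

From there the rows are trivial to turn into INSERT statements or a COPY file for loading into a table.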
Re: [HACKERS] Timezone List
Martijn van Oosterhout wrote: In the CVS version there is a table with this information: http://developer.postgresql.org/pgdocs/postgres/view-pg-timezonenames.html Great, thanks for that. Err, where does postgres use this information? I believe there is a project on pgFoundry that has some standard datasets. Currently, it is stored in /src/timezone/data/iso3166.tab, and I propose to have it available in a system view or something similar. This data is as useful as the available timezones, although I concede that it is not part of PG functionality, and it may be more appropriate as a simple file that can be psql -f'd into the database if users need it as part of an app. It's more developer helper data than database functionality, and hence it could be more appropriate to distribute it through the support community rather than as part of the postgresql core. Comments?
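Pending anything in core, generating such a psql -f'able file from iso3166.tab is only a few lines of scripting. A minimal Python sketch (the file is tab-separated code/name pairs with '#' comment lines, per the tzdata distribution):

```python
# Minimal reader for tzdata's iso3166.tab: each non-comment line is
# "<2-letter code>\t<country name>". Returns a code -> name mapping.
def read_iso3166(text):
    countries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        code, name = line.split("\t", 1)
        countries[code] = name
    return countries

sample = """# ISO 3166 alpha-2 country codes
AD\tAndorra
AU\tAustralia
"""
print(read_iso3166(sample))  # {'AD': 'Andorra', 'AU': 'Australia'}
```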
Re: [HACKERS] Timezone List
Actually, what that view gives you is timezone offset abbreviations, not the full zone names that you could use with SET TIME ZONE. It strikes me that we should have a view for that as well. We could use code similar to scan_available_timezones() to generate the view output. It's somewhat urgent to address this now, because pg_timezonenames is sitting on the obvious name for such a view, and once we release 8.2 we won't be able to change it. On reflection I think the existing view is wrongly named --- perhaps it should be pg_timezoneabbrevs? Or more readably, perhaps pg_timezone_abbrevs, with pg_timezone_names for the other view. regards, tom lane

I agree with having two views, and I also think that the name as it stands is not right. I agree with pg_timezone_abbrevs and pg_timezone_names or similar.

On a related note, there is not a one-to-one relationship between abbreviations and zone names; some abbreviations are used by more than one zone (for example, EST, CST and others are used in both Australia and the Americas), and currently it is a server configuration directive (australian_timezones) that determines whether Australian or American zones are assumed in the case of ambiguity. I don't know about anyone else, but the whole australian_timezones thing seems like an ugly hackaround to me. I do not have a proposed solution to this, but I see a non-trivial risk of an application being re-deployed on a server where the admin forgets to change this directive, resulting in all kinds of fun and games. Forgive me if this is an already-discussed issue.

I am also rather baffled at the way SAT changes from being interpreted as a day of the week in one mode to a timezone in another. This seems an awful incongruity of behavior, and SAT should be interpreted as a timezone in both modes. If the switching of this behavior must be kept, it doesn't fit in with the purpose of the australian_timezones directive and should be made the subject of a different directive (e.g., sat_is_timezone (boolean) or something similar).
SAT should, IMHO, always be considered a timezone and use of the SAT string by DB programmers should be just another case for care as with any other SQL keyword.
[HACKERS] PostgreSQL on 64 bit Linux
I have a PostgreSQL installation on a Debian box that had the 64-bit SMP kernel installed before PostgreSQL was compiled and installed on it. Does PostgreSQL take any advantage of the 64-bit environment, or have we not done anything to move into the 64-bit world yet? Regards, - Naz
Re: [HACKERS] PostgreSQL on 64 bit Linux
Douglas McNaught wrote: Naz Gassiep [EMAIL PROTECTED] writes: I have a PostgreSQL installation on a Debian box that had the 64-bit SMP kernel installed before PostgreSQL was compiled and installed on it. Does PostgreSQL take any advantage of the 64-bit environment or have we not done anything to move into the 64-bit world yet? Depends on whether PG was compiled as 64-bit or 32-bit--is your toolchain 64-bit all the way, or is it just the kernel? -Doug I just compiled it as the manual says. I guess I must have compiled it as 32-bit. I'll recompile as 64-bit when I upgrade to 8.2 once it's out. Thanks, - Naz.
Re: [HACKERS] pg_dump schema breakup
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Well, the other issue is how many canned breakup schemes we are going to support. If this particular one is of sufficiently general usefulness then I have no objection. But when you can produce it trivially from the output of "pg_dump -s", the need to hardcode it hardly seems pressing. FWIW, I am in favor of providing a way to break up the dump output like this, I was merely objecting to the vocabulary ;-). We have certainly seen tons of people burnt by the performance problems inherent in separate-data-and-schema restores, and splitting the dump into three parts instead of two seems like it would fix that. But I also like Alvaro's comment that this should be on the restore side not so much the dump side. If you do two or three successive pg_dump runs to make your dump then you run a nontrivial risk of not getting consistent dumps. My advice to people would be to do *one* full "pg_dump -Fc" and then extract three scripts out of that. The question then is whether it's worth providing the extraction functionality in a more canned, user-friendly form than "here, hack up the -L output with this perl script". I'd vote yes. regards, tom lane

I greatly appreciate the comments here and am glad that my initial idea has support. This thread highlights to me the difference between the "hey, there's a good idea there, despite the fact that he's obviously not a veteran software developer" culture that the PostgreSQL community has, instead of the "he is obviously not a veteran software developer, so what on Earth could he have to offer us" responses I've had from various other open source projects.

On a less obsequious note, I agree that pg_dump should be used to dump everything in a single run to avoid consistency issues, and the selection of data to be restored should be done with pg_restore.
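The "hack up the -L output" approach Tom describes can be sketched quickly. The Python fragment below is only an illustration of the idea, not the actual perl script from the thread: it crudely classifies stylized `pg_restore -l` listing entries (real listings carry more entry types: ACLs, comments, sequences and so on) into three buckets, which could then be written out as three list files for `pg_restore -L`.

```python
# Crude three-way split of a `pg_restore -l` TOC listing into
# pre-data (tables and other definitions), data, and post-data
# (constraints, indexes, triggers). The type tokens checked here
# cover only the common cases; extend as needed.
def classify(line):
    stripped = line.strip()
    if not stripped or stripped.startswith(";"):
        return None  # blank line or comment in the listing
    if " TABLE DATA " in line:
        return "data"
    if any(tok in line for tok in (" CONSTRAINT ", " FK CONSTRAINT ",
                                   " INDEX ", " TRIGGER ")):
        return "post-data"
    return "pre-data"

def split_listing(text):
    buckets = {"pre-data": [], "data": [], "post-data": []}
    for line in text.splitlines():
        kind = classify(line)
        if kind is not None:
            buckets[kind].append(line)
    return buckets

# Stylized listing entries; real `pg_restore -l` output has this shape.
sample = """;
; Archive created at some point
;
2; 1262 16384 DATABASE - mydb postgres
123; 1259 16386 TABLE public orders postgres
124; 1259 16390 TABLE public customers postgres
2389; 0 16386 TABLE DATA public orders postgres
2600; 2606 16400 CONSTRAINT public orders_pkey postgres
2601; 2606 16402 FK CONSTRAINT public orders_customer_fkey postgres
"""
for kind, entries in split_listing(sample).items():
    print(kind, len(entries))  # pre-data 3, data 1, post-data 2
```

Each bucket, written to its own file, could then drive three successive `pg_restore -L <file>` passes against one consistent `pg_dump -Fc` archive.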
As this is a feature that I would benefit greatly from, how do I go about ensuring that this idea finds its way to the appropriate developer and doesn't get forgotten in the mountain of ideas in the "that'd be nice to have some day" category? - Naz
[HACKERS] pg_dump schema breakup
This is my first post to a PostgreSQL mailing list, so please forgive me if I have posted to the wrong place.

Currently, pg_dump has flags for dumping only table definitions and/or data. These flags are, respectively: --schema-only and --data-only. I propose that two more be added: --tables-only and --constraints-only. These would essentially break up the output of --schema-only into two sections. I.e., the output of --tables-only plus the output of --constraints-only would be identical to the output of --schema-only.

There are a number of scenarios where this may be useful; I will describe the one that I would use it for. When making changes to my database schema, I would like to take the schema from the newly modified database and dump the data from the old database into it, to ensure that the new schema is able to take the data that exists in the live database. If it isn't, then I modify the new schema or the live dataset as appropriate, and try again. This requires the following steps:

1. Create a temporary database and apply the modified schema to it
2. Dump the new database using --schema-only
3. Split the new schema into table definitions and constraints
4. Apply the new schema table definitions from step 3 to the testing database
5. Dump the existing database using --data-only
6. Apply the dataset from step 5 to the testing database
7. Apply the new schema constraints from step 3 to the testing database

All of these steps are easily scriptable except step 3, which means that making quick changes to the new schema and re-applying them includes the tedious task of opening the 5,000-line schema file in a text editor and manually pasting the relevant sections into new files. Step 3 really does hold up the development process with regard to testing changes to the schema. Generalizing the nature of this task, the pg_dump features I propose would allow the easy scripting of dumping databases, making changes to the dumps and then re-applying them in a non-paradox-inducing order.
FWIW, I thought this would be a very simple patch, and had a look at the code for pg_dump myself, despite the fact that I've not even written a Hello World in C as yet. That attempt failed miserably, so now I am reduced to selling the merits of this idea to real developers. Incidentally, --schema-only appears to break tables and constraints into two sections already, as evidenced by the fact that all the tables are created first, and the constraints afterwards.
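Relying on exactly that ordering, step 3 of the workflow described above can be approximated in a few lines. A naive Python sketch (it assumes every ADD CONSTRAINT statement comes after all the CREATE TABLEs, as plain --schema-only output does, and would need care with anything fancier):

```python
# Naively split a plain-text `pg_dump --schema-only` file into the
# part to load before the data (up to the first ADD CONSTRAINT
# statement) and the part to load afterwards. Relies on pg_dump's
# ordering: table definitions first, constraints last.
def split_schema(sql):
    lines = sql.splitlines()
    for i, line in enumerate(lines):
        if "ADD CONSTRAINT" in line:
            start = i
            # pg_dump puts the ALTER TABLE header on the previous line
            if i > 0 and lines[i - 1].startswith("ALTER TABLE"):
                start = i - 1
            return "\n".join(lines[:start]), "\n".join(lines[start:])
    return sql, ""  # no constraints found

sample = """CREATE TABLE orders (
    id integer NOT NULL,
    customer integer
);

ALTER TABLE ONLY orders
    ADD CONSTRAINT orders_pkey PRIMARY KEY (id);
"""
tables, constraints = split_schema(sample)
print(constraints.splitlines()[0])  # ALTER TABLE ONLY orders
```

The two halves could then be fed to psql before and after the --data-only load, making the whole seven-step cycle scriptable.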
Re: [HACKERS] pg_dump schema breakup
Tom Lane wrote: Naz Gassiep [EMAIL PROTECTED] writes: I propose that two more be added: --tables-only --constraints-only This doesn't seem well-defined at all. There are many objects in a database that are definitely neither tables nor constraints, and it's not very clear what things should be considered constraints either. I think what you may really be after is "the stuff that should be loaded before inserting data" and "the stuff that should be loaded after", but the above are poor names for these concepts. regards, tom lane And here I was thinking that I'd been clear :) But yes, you are right: what I want is "the stuff that gets loaded before data insertion" and "the stuff that goes after data insertion". Essentially, the CREATE TABLE statements are the first part, as that is what is needed for data to be inserted. Everything else is the second part.
Re: [HACKERS] pg_dump schema breakup
Andreas Joseph Krogh wrote: On Friday 18 August 2006 18:52, Tom Lane wrote: Naz Gassiep [EMAIL PROTECTED] writes: I propose that two more be added: --tables-only --constraints-only This doesn't seem well-defined at all. There are many objects in a database that are definitely neither tables nor constraints, and it's not very clear what things should be considered constraints either. I think what you may really be after is "the stuff that should be loaded before inserting data" and "the stuff that should be loaded after", but the above are poor names for these concepts. But it certainly would be nice to be able to dump all that "stuff":-) Yea, I've been told that this would not be a high demand feature. So do I have a second vote? ;-)
Re: [HACKERS] pg_dump schema breakup
Andrew Dunstan wrote: We already have a highly selective and configurable restore mechanism, using the -L feature of pg_restore. Maybe there's a good special case for this particular split, but it is hardly undoable now. As for Naz' needs - I gave him a perl script I whipped up in few minutes to do the split he wanted - and I'm sure you could do the same in python ;-) cheers andrew And very appreciative I am too. While I concede it is doable now, and I could do this in Python as well, I do feel that a feature like this would have wider use than may be apparent on a first glance. Furthermore, I think that the simplicity of implementing it makes it really a question of why not rather than why. - Naz