Re: [HACKERS] [COMMITTERS] pgsql: Fix PGXS conventions so that extensions can be built against
Dear Robert,

>> Fix PGXS conventions so that extensions can be built against Postgres
>> installations whose pg_config program does not appear first in the PATH.
>> Per gripe from Eddie Stanley and subsequent discussions with Fabien
>> Coelho and others.
>
> Is there any chance of this being backpatched? I just spent a few hours
> tracking down a problem with compiling a 3rd-party module against an 8.2
> installation installed separately from my system's packaged install.
> (I.e. I didn't move it, but there was an additional pg_config on the
> system pointing to the wrong/other place.) I'm not exactly sure how pgxs
> ever worked on systems with multiple Postgres installations.

Simple: the target pg_config MUST be ahead in the PATH.

-- 
Fabien.
Re: [HACKERS] what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL
"Pavel Stehule" <[EMAIL PROTECTED]> writes: > Global temp table can be created from template only when is used. It's > has not negative efect on app which doesn't use it. The benefit of > g.t.t. is simplifycation of stored procedures. And if it's used in 200 txns/s? Imagine the earlier poster who was looking for a way to display the count of records matching a search followed by the ten records on the page without re-executing the search. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Proposal: In-Place upgrade concept
I attach the In-Place Upgrade project concept. Technical details about the
implementation of each part will be sent later (after the concept is
accepted). Please let me know your comments.

thanks Zdenek


In-place Upgrade project
------------------------

Overview
--------
The PostgreSQL community releases a new version of PostgreSQL at least once
a year, but the released version is not able to run with old data files.
For a user who wants to upgrade his database server, upgrading a bigger
database is a nightmare, and upgrading a very large database is impossible.
This project will try to solve this issue and provide functionality for
PostgreSQL which will cover all users' requirements.

Current status
--------------
1) Data upgrade is supported only by exporting and importing data using
   pg_dump
   - slow and causes a long downtime
   - extra disk space needed
   - requires both versions (old & new)
2) pg_migrator
   - faster than 1), but it still has a long downtime when the on-disk
     structure has changed
   - the previous version of PostgreSQL is needed
   - no downgrade
   - requires both versions (old & new)

Project Goals
-------------
1) Minimize the database downtime.
2) Do not require extra disk space.
3) Do not require an old version of PostgreSQL to be installed.
4) Allow a fallback to the original PostgreSQL version if something goes
   wrong.
5) Support all data types, including user-defined types.
6) Should be easy to use.

Functional specification
------------------------
There might be several different approaches to performing the upgrade. We
assume that we need to convert a very big database. If we upgrade the data
files offline, the database downtime might be critical, so we will instead
try to convert the data online. This brings the requirement that the new
major version of PostgreSQL must understand the old data structures and
must be able to work with them.

The next important requirement is the ability to fall back (downgrade) to
the previous PostgreSQL version. When the new major version does not work
well for the user, he needs to be able to go back to the previous version
with minimal downtime. This brings the requirement that PostgreSQL should
also be able to write data in the old format. However, we might expect some
performance penalty.

The above requirements lead to PostgreSQL running in three modes:

compatibility mode (compat) - the new version is running on the old data
files. It is possible to read and write data as in the previous version.
Features of the new version might not be available. Fallback to the
previous version is possible.

upgrade mode - a conversion of data files is running in the background.
PostgreSQL has to be able to work with both the new data file format and
the old data file format. Features of the new version might not be
available. PostgreSQL has to be able to recover and continue in upgrade
mode in case of a power/system failure. Fallback to the previous version
is not possible.

normal mode - all required data files have been upgraded to the new format.
Features of the new version are available. Fallback to the previous version
is not possible.

Project Roadmap
---------------
1) Create a detailed design for each affected part
2) Design a testing environment/process
3) Implementation
4) Change development/release processes - patch authors have to be
   responsible for providing the required upgrade operations/functions.
   Some development related to upgrade might also be done after feature
   freeze. In-place upgrade should be available since the x.y.1 version.
List of affected parts
----------------------
1) Data Page Structure
   1.1) Page Layout
   1.2) Tuple header and structure
   1.3) Data encoding - data type changes
2) Control File
3) WAL
4) Catalog
5) Configuration files
7) Others (Procedure language, Communication protocol ...)
Re: [HACKERS] Still recommending daily vacuum...
Joshua D. Drake wrote:
> Alvaro Herrera wrote:
>> Joshua D. Drake wrote:
>>> Did we change the default autovac parameters for 8.3 (beyond turning
>>> it on)? Because on any reasonably used database, they are way too
>>> conservative.
>>
>> We're still on time to change them ... Any concrete proposals?
>
> I could provide numbers from production high-use databases. We could
> probably back those down a little and make more reasonable numbers.

Please do so. Perhaps others can also tell their typical settings.

Best Regards
Michael Paesold
Re: [HACKERS] Proposal: In-Place upgrade concept
Zdenek Kotala wrote:
> Project Goals
> -------------
> ...
> 3) Do not require an old version of PostgreSQL to be installed.

Why not? Having two versions installed at the same time doesn't seem like a
problem to me. You can remove the old version as soon as the upgrade is
done.

> 4) Allow a fallback to the original PostgreSQL version if something goes
>    wrong.
> ...
> compatibility mode (compat) - the new version is running on the old data
> files. It is possible to read and write data as in the previous version.
> Features of the new version might not be available. Fallback to the
> previous version is possible.

That's madness. Why would you want to do that? The complexity it adds is
just mind-boggling.

You still wouldn't be able to downgrade after you switch from compatibility
mode, and it seems likely that any problems, whatever they might be, would
not appear until you switch. That means you'll need to be prepared to
downgrade using a backup anyway, so the compatibility mode doesn't buy you
much.

> upgrade mode - a conversion of data files is running in the background.
> PostgreSQL has to be able to work with both the new data file format and
> the old data file format. Features of the new version might not be
> available. PostgreSQL has to be able to recover and continue in upgrade
> mode in case of a power/system failure. Fallback to the previous version
> is not possible.

That's feasible, though I don't see why new features wouldn't be available.
As before, upgrade can be done; it's just a matter of someone scratching
the itch. pg_migrator can handle the catalog changes. Doing the page
conversion from 8.2 -> 8.3 is possible, and it could be done on the fly
inside PostgreSQL the first time a page is read in.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL
"Tom Lane" <[EMAIL PROTECTED]> writes: > Alvaro Herrera <[EMAIL PROTECTED]> writes: >> Tom Lane escribió: >>> I rather doubt that. The most likely implementation would involve >>> cloning a "template" entry into pg_class. > >> How about a new relkind which causes the table to be located in >> PGDATA/base//pg_temp_/ >> So each backend can have its own copy of the table with the same >> relfilenode; there's no need for extra catalog entries. > > Uh-huh. And what do you do with relpages, reltuples, relfrozenxid, and > pg_statistic entries? What if one backend wants to TRUNCATE or CLUSTER > its copy (requiring a new relfilenode)? Where does ALTER TABLE fit into > this? I would have suggested that when we construct the relcache entry for the table we substitute a local version of refilenode for the global one. None of those sound like hard problems. Certainly it's more invasive this way but the other way is just a hack for complying with the letter of the spec without actually making it work right. It would be silly and in many use cases useless to have regular DML operating on data which has no business being anything but backend-local generate garbage in on-disk catalog tables. I had a strange thought though. The ideal data structure for local pg_statistic data in the unlikely case that users analyze their local tables would in fact be a global temporary table as well. I wonder if we could bootstrap something similar for pg_class as well. Incidentally, for what would imagine relfozenxid would be useful for these tables anyways? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL
>> A global temp table can be created from a template only when it is used.
>> It has no negative effect on an app which doesn't use it. The benefit of
>> g.t.t. is simplification of stored procedures.
>
> And if it's used in 200 txns/s? Imagine the earlier poster who was
> looking for a way to display the count of records matching a search
> followed by the ten records on the page without re-executing the search.

I wrote about a comparison of global temp tables and the current temp
tables. Counting a result's records is a problem, I know. It's an
incompleteness of the current cursor implementation. Every cursor can be
materialised and then counted. We need an OPEN operation which materialises
the cursor and returns the real row count.

Regards
Pavel
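[For reference, the pattern Gregory and Pavel are discussing can be
approximated today with an ordinary per-backend temporary table; a minimal
sketch, with made-up table and column names:]

    BEGIN;

    -- Materialise the search result once in a per-backend temp table.
    CREATE TEMP TABLE search_hits AS
        SELECT id, title
        FROM documents
        WHERE title ILIKE '%postgres%';

    -- Total number of matches, without re-running the search.
    SELECT count(*) FROM search_hits;

    -- First page of ten rows, served from the same materialised result.
    SELECT id, title FROM search_hits ORDER BY id LIMIT 10;

    DROP TABLE search_hits;
    COMMIT;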
Re: [HACKERS] Proposal: In-Place upgrade concept
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > As before, upgrade can be done, it's just a matter of someone scratching the > itch. pg_migrator can handle the catalog changes. Doing the page conversion > from 8.2 -> 8.3 is possible, and it could be done on-the-fly inside PostgreSQL > the first time a page is read in. I was previously thinking a convertor for the packed varlena change wouldn't be necessary since it handles things just fine when it finds a 4-byte header where a 1-byte header might have been used. I just realized that's not true. All varlena headers would have to be shifted two bits to the left (on little-endian machines) and have their toast bits fiddled even if we don't bother converting them to the shrink their size. Externally toasted varlenas would however necessarily change size because they must use the new format. This is actually a bit of a problem. We would need to know when we read in a page what the tupledescriptor for that relation looks like to know which fields are varlena. I'm not sure how easy it would be to arrange for the tuple descriptor to be passed down that far. Conceivably we could grab another infomask bit to indicate "uses new-style varlenas" and then have heaptuple.c understand how to convert them in place. But that leads to a ton of memory management or page locking problems. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Proposal: In-Place upgrade concept
Heikki Linnakangas wrote:
> Zdenek Kotala wrote:
>> Project Goals
>> -------------
>> ...
>> 3) Do not require an old version of PostgreSQL to be installed.
>
> Why not? Having two versions installed at the same time doesn't seem like
> a problem to me. You can remove the old version as soon as the upgrade is
> done.

This could be a problem if you want to upgrade the operating system
together with the DB and the new operating system does not ship that
version. This requirement comes from the operating system vendor, and also
from my experience with an Oracle upgrade on a Tru64 cluster, where two
installed versions of Oracle did not work well together - OK, PostgreSQL is
better in this regard :-).

>> 4) Allow a fallback to the original PostgreSQL version if something goes
>>    wrong.
>> ...
>> compatibility mode (compat) - the new version is running on the old data
>> files. It is possible to read and write data as in the previous version.
>> Features of the new version might not be available. Fallback to the
>> previous version is possible.
>
> That's madness. Why would you want to do that? The complexity it adds is
> just mind-boggling.

I don't think it increases complexity much. You will have a conversion
function to convert a page/data from version x to x+1, and the same
mechanism could be used to convert data back. It needs more investigation,
but I'm not afraid about complexity (meanwhile :-).

> You still wouldn't be able to downgrade after you switch from
> compatibility mode, and it seems likely that any problems, whatever they
> might be, would not appear until you switch. That means you'll need to be
> prepared to downgrade using a backup anyway, so the compatibility mode
> doesn't buy you much.

One kind of problem is user-defined types/functions and extensions. You can
test a lot of things on test machines, but some problems only show up on
big tables/storage, and not many PostgreSQL users have a multi-terabyte
disk array for testing. A second kind of problem is an operating system
upgrade. If you upgrade the OS and PostgreSQL together and something fails
(not only the new version of PostgreSQL, but many other things), then the
admin wants to revert back to the old OS version with the old Postgres
version. In Solaris this is called the Live Upgrade feature, and it makes
such a rollback easy. Backup/restore is a possible solution too -
unfortunately not a very comfortable one.

I think this is a nice-to-have requirement: if it turns out to be possible
we can do it, and if it brings a lot of complication we can drop it.

>> upgrade mode - a conversion of data files is running in the background.
>> PostgreSQL has to be able to work with both the new data file format and
>> the old data file format. Features of the new version might not be
>> available. PostgreSQL has to be able to recover and continue in upgrade
>> mode in case of a power/system failure. Fallback to the previous version
>> is not possible.
>
> That's feasible, though I don't see why new features wouldn't be
> available.

For example, Oracle 10 implemented a new incremental backup feature, but
you can use it only when you have converted the database to the correct
format. I'm not Oracle and I'm not able to predict which new PostgreSQL
features might require, for example, that all pages have already been
updated. I don't expect that the database upgrading mode will affect many
new features, but we must keep it in mind.

Zdenek
Re: [HACKERS] Proposal: In-Place upgrade concept
Gregory Stark wrote:
> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
>> As before, upgrade can be done; it's just a matter of someone scratching
>> the itch. pg_migrator can handle the catalog changes. Doing the page
>> conversion from 8.2 -> 8.3 is possible, and it could be done on the fly
>> inside PostgreSQL the first time a page is read in.
>
> I was previously thinking a converter for the packed varlena change
> wouldn't be necessary since it handles things just fine when it finds a
> 4-byte header where a 1-byte header might have been used. I just realized
> that's not true. All varlena headers would have to be shifted two bits to
> the left (on little-endian machines) and have their toast bits fiddled,
> even if we don't bother converting them to shrink their size. Externally
> toasted varlenas, however, would necessarily change size because they
> must use the new format.
>
> This is actually a bit of a problem. We would need to know, when we read
> in a page, what the tuple descriptor for that relation looks like in
> order to know which fields are varlena. I'm not sure how easy it would be
> to arrange for the tuple descriptor to be passed down that far.

Speaking of on-the-fly upgrading, ReadBuffer is already passed the
Relation, which contains the TupleDesc, so I don't think that's a problem.
Not sure how easy that would be to do in an external program like
pg_migrator.

> Conceivably we could grab another infomask bit to indicate "uses
> new-style varlenas" and then have heaptuple.c understand how to convert
> them in place. But that leads to a ton of memory management or page
> locking problems.

My thinking is that when a page in the old format is read in, it's
converted to the new format before doing anything else with it.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Proposal: In-Place upgrade concept
Heikki Linnakangas wrote:
> Gregory Stark wrote:
>> Conceivably we could grab another infomask bit to indicate "uses
>> new-style varlenas" and then have heaptuple.c understand how to convert
>> them in place. But that leads to a ton of memory management or page
>> locking problems.
>
> My thinking is that when a page in the old format is read in, it's
> converted to the new format before doing anything else with it.

Yes, I agree with Heikki. Doing the conversion anywhere else will generate
a big maintenance burden.

Zdenek
Re: [HACKERS] Proposal: In-Place upgrade concept
On Tue, Jul 03, 2007 at 12:05:07PM +0100, Heikki Linnakangas wrote:
>> This is actually a bit of a problem. We would need to know, when we
>> read in a page, what the tuple descriptor for that relation looks like
>> in order to know which fields are varlena. I'm not sure how easy it
>> would be to arrange for the tuple descriptor to be passed down that far.
>
> Speaking of on-the-fly upgrading, ReadBuffer is already passed the
> Relation, which contains the TupleDesc, so I don't think that's a
> problem. Not sure how easy that would be to do in an external program
> like pg_migrator.

My reading of this thread so far is that we're intending to upgrade pages
on the fly? Which means that at any point in time some pages will be
converted and some not, so the tuple descriptor isn't going to help; surely
you need some identifier on the page telling you whether it's upgraded or
not?

Have a nice day,
-- 
Martijn van Oosterhout <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
Re: [HACKERS] Proposal: In-Place upgrade concept
Martijn van Oosterhout wrote:
> On Tue, Jul 03, 2007 at 12:05:07PM +0100, Heikki Linnakangas wrote:
>>> This is actually a bit of a problem. We would need to know, when we
>>> read in a page, what the tuple descriptor for that relation looks like
>>> in order to know which fields are varlena. I'm not sure how easy it
>>> would be to arrange for the tuple descriptor to be passed down that
>>> far.
>>
>> Speaking of on-the-fly upgrading, ReadBuffer is already passed the
>> Relation, which contains the TupleDesc, so I don't think that's a
>> problem. Not sure how easy that would be to do in an external program
>> like pg_migrator.
>
> My reading of this thread so far is that we're intending to upgrade pages
> on the fly? Which means that at any point in time some pages will be
> converted and some not, so the tuple descriptor isn't going to help;
> surely you need some identifier on the page telling you whether it's
> upgraded or not?

Yes, we already have such an identifier. The page layout version number is
stored on every page.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Proposal: In-Place upgrade concept
Martijn van Oosterhout wrote:
> On Tue, Jul 03, 2007 at 12:05:07PM +0100, Heikki Linnakangas wrote:
>>> This is actually a bit of a problem. We would need to know, when we
>>> read in a page, what the tuple descriptor for that relation looks like
>>> in order to know which fields are varlena. I'm not sure how easy it
>>> would be to arrange for the tuple descriptor to be passed down that
>>> far.
>>
>> Speaking of on-the-fly upgrading, ReadBuffer is already passed the
>> Relation, which contains the TupleDesc, so I don't think that's a
>> problem. Not sure how easy that would be to do in an external program
>> like pg_migrator.
>
> My reading of this thread so far is that we're intending to upgrade pages
> on the fly? Which means that at any point in time some pages will be
> converted and some not, so the tuple descriptor isn't going to help;
> surely you need some identifier on the page telling you whether it's
> upgraded or not?

Currently there is a page layout version number, but that information is
not useful for a conversion from 8.1 or 8.2, because both versions have the
same number while some data types (inet/cidr) have a different storage
format. There was a small discussion about the page version here:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00745.php

Page identification is one thing that must be improved.

Zdenek
Re: [HACKERS] Postgresql.conf cleanup
On Monday, 2 July 2007, 13:03, Josh Berkus wrote:
> (change requires restart): this phrase appears over 20 times in the
> notes. This is enough times to be really repetitive and take up a lot
> of scrolling space, while not actually covering all startup-time
> parameters.

Which ones are missing?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/
Re: [HACKERS] what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
>> Tom Lane escribió:
>>> I rather doubt that. The most likely implementation would involve
>>> cloning a "template" entry into pg_class.
>
>> How about a new relkind which causes the table to be located in
>> PGDATA/base//pg_temp_/
>> So each backend can have its own copy of the table with the same
>> relfilenode; there's no need for extra catalog entries.
>
> Uh-huh. And what do you do with relpages, reltuples, relfrozenxid, and
> pg_statistic entries? What if one backend wants to TRUNCATE or CLUSTER
> its copy (requiring a new relfilenode)? Where does ALTER TABLE fit into
> this?

And what is the use-case for this functionality? What does it give us that
we don't already have?

-- 
Bruce Momjian <[EMAIL PROTECTED]>   http://momjian.us
EnterpriseDB                        http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] todo: Hash index creation
Wow... not sure how I missed that. I *did* create this schema ages ago;
perhaps it wasn't there, or at the time I had no idea what the implications
were. *shrug*

Regards,
- Naz.

Tom Lane wrote:
> Naz Gassiep <[EMAIL PROTECTED]> writes:
>> As a result, when creating tables containing large blocks of text I wish
>> to index, I've been using HASH as an index method. Please can we state
>> in the manual that HASH index types are in a beta stage of development
>> or something similar, or perhaps remove the manual entry altogether
>> until HASH is at a point where it is usable in production.
>
> Uh, the manual already does say
>
>     Note: Testing has shown PostgreSQL's hash indexes to perform no
>     better than B-tree indexes, and the index size and build time for
>     hash indexes is much worse. Furthermore, hash index operations are
>     not presently WAL-logged, so hash indexes might need to be rebuilt
>     with REINDEX after a database crash. For these reasons, hash index
>     use is presently discouraged.
>
> under 11.2 Index Types, as well as various derogatory remarks elsewhere.
>
>             regards, tom lane
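[For the "large blocks of text" use case itself, a common workaround - not
suggested in this thread, offered here only as a hedged sketch with made-up
names - is a plain b-tree expression index over a digest of the text, using
the built-in md5() function, rather than a hash index:]

    -- Hypothetical table with a large text column.
    CREATE TABLE docs (id serial PRIMARY KEY, body text);

    -- B-tree expression index over a digest of the text.
    CREATE INDEX docs_body_md5_idx ON docs (md5(body));

    -- Equality lookups must use the same expression so the index applies.
    SELECT id FROM docs WHERE md5(body) = md5('some exact text to find');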
[HACKERS] how to "pg_dump", based in select command
Hello folks.

I want to know how to dump some parts of a database. I need to extract the
records in a "select * from table", and pg_dump or another tool will create
a file with the records found by this select command; after this, I will
restore this file in another database with the same structure.

Any idea? Thanks people.

Jeferson Kasper
Re: [HACKERS] Proposal: In-Place upgrade concept
Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> My thinking is that when a page in the old format is read in, it's
>> converted to the new format before doing anything else with it.
>
> Yeah, I'm with Heikki on this. What I see as a sane project definition
> is:
>
> * pg_migrator or equivalent to convert the system catalogs
>
> * a hook in ReadBuffer to allow a data page conversion procedure to be
>   applied, on the basis of checking for old page layout version.

pg_migrator is a separate tool which requires the old Postgres version, and
I would like to have a solution in the Postgres binary without the old
version being present. Very often the new Postgres version is stored in the
same location (e.g. /usr/bin) and normal users could have a problem. I see
three possible solutions there:

1) a special postgres startup mode - postgres --upgrade-catalog
2) automatic conversion - postgres converts the catalog automatically on
   the first startup on an old db cluster
3) (in compat mode) the catalog is converted on the fly (read/write),
   until upgrade mode is started

> I think insisting on a downgrade option is an absolutely certain way
> of guaranteeing that the project will fail.

As I mentioned before, this is a nice-to-have requirement. I would like to
keep it in mind, and when it starts a complexity explosion we can remove it
from the requirement list.

> I'm not sure it's feasible to expect that we can change representations
> of user-defined types, either. I don't see how you would do that
> without catalog access (to look up the UDT), and the page conversion
> procedure is going to have to be able to operate without catalog
> accesses. (Thought experiment: a page is read in during crash recovery
> or PITR slave operation, and discovered to have the old format.)

The idea for handling an on-disk representation change of a data type is to
keep both the old and the new data type in/out functions. Newly created
tables will contain the new type implementation, and old tables could be
converted with an ALTER TABLE command on user request. The old data type
could be stored in a compat library.

> BTW, I thought of a likely upgrade problem that we haven't discussed
> (AFAIR) in any of the many threads on this subject. What about an index
> access method change that involves an index-wide restructuring, such
> that it can't be done one page at a time? A plausible example is
> changing hash indexes to have multiple buckets per page. Presumably
> you can fix the index with REINDEX, but that doesn't meet the goal of
> limited downtime, if the index is big. Is there another way?

Yes, there is a way: keep both the old and new implementation of the index,
each with a different OID. The primary key for the pg_am table would be
name + pg_version - it is similar to the UDT solution. CREATE INDEX and
REINDEX would use the current implementation.

Zdenek
Re: [HACKERS] how to "pg_dump", based in select command
On 7/3/07, Jeferson Kasper <[EMAIL PROTECTED]> wrote:
> Hello folks.
> I want to know how to dump some parts of a database. I need to extract
> the records in a "select * from table", and pg_dump or another tool will
> create a file with the records found by this select command; after this,
> I will restore this file in another database with the same structure.
> Any idea? Thanks people.
> Jeferson Kasper

In 8.2 you can use the COPY command with a query, as in

    COPY (SELECT ...) TO 'filename'

In earlier versions, one alternative is to create a table filled with the
results of your query and pg_dump it. Alternatively you might also pipe the
query into psql and pipe the output to a file, use psql's \o option, etc.

-Josh
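[A minimal sketch of the two approaches Josh describes, with made-up table
and file names. Note that COPY ... TO/FROM reads and writes files on the
*server*; for a client-side file, psql's \copy does the same job.]

    -- 8.2 and later: COPY straight from a query to a server-side file.
    COPY (SELECT * FROM orders WHERE order_date >= '2007-01-01')
        TO '/tmp/orders_2007.copy';

    -- On the target database (same table structure), load the file back:
    COPY orders FROM '/tmp/orders_2007.copy';

    -- Pre-8.2 alternative: materialise the query, then dump only that
    -- table and restore it elsewhere.
    CREATE TABLE orders_2007 AS
        SELECT * FROM orders WHERE order_date >= '2007-01-01';
    -- shell: pg_dump -t orders_2007 mydb | psql otherdb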
Re: [HACKERS] Proposal: In-Place upgrade concept
On Tue, Jul 03, 2007 at 11:36:03AM -0400, Tom Lane wrote:
> I'm not sure it's feasible to expect that we can change representations
> of user-defined types, either. I don't see how you would do that
> without catalog access (to look up the UDT), and the page conversion
> procedure is going to have to be able to operate without catalog
> accesses. (Thought experiment: a page is read in during crash recovery
> or PITR slave operation, and discovered to have the old format.)

Well, there are two types of conversions:

1. Simple byte rearrangement. If there aren't too many, you could simply
   build them into pg_migrator. Doesn't help with user-defined types, but
   maybe you allow plugins to define a separate hook whose only purpose is
   to upgrade the value (without catalog access...).

2. Otherwise you could do a VACUUM over the table to touch every page,
   thus solving it.

Dunno what to do about crashing at this point. Hmm, actually, what's the
problem with PITR restoring a page in the old format? As long as it's clear
it's the old format, it'll get fixed when the page is actually used.

> BTW, I thought of a likely upgrade problem that we haven't discussed
> (AFAIR) in any of the many threads on this subject. What about an index
> access method change that involves an index-wide restructuring, such
> that it can't be done one page at a time? A plausible example is
> changing hash indexes to have multiple buckets per page. Presumably
> you can fix the index with REINDEX, but that doesn't meet the goal of
> limited downtime, if the index is big. Is there another way?

Well, we have concurrent index builds these days. I certainly don't have
ideas on how to fix this, especially if the index is on a datatype that has
changed format... I suppose those indexes will just have to be rebuilt (a
REINDEX will upgrade every page in the table anyway...). I think it'd still
be cheaper than dump/restore.

Have a nice day,
-- 
Martijn van Oosterhout <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
Re: [HACKERS] Proposal: In-Place upgrade concept
On Tue, Jul 03, 2007 at 07:13:29PM +0200, Zdenek Kotala wrote:
> pg_migrator is a separate tool which requires the old Postgres version,
> and I would like to have a solution in the Postgres binary without the
> old version being present. Very often the new Postgres version is stored
> in the same location (e.g. /usr/bin) and normal users could have a
> problem.

Possibly. But you have to demonstrate it actually works, and it's best to
do that with a separate process. Or fix pg_migrator to not require the old
version; either way will do.

>> I think insisting on a downgrade option is an absolutely certain way
>> of guaranteeing that the project will fail.
>
> As I mentioned before, this is a nice-to-have requirement. I would like
> to keep it in mind, and when it starts a complexity explosion we can
> remove it from the requirement list.

It seems to me that if every step of the process is WAL logged, then
downgrading is simply a matter of restoring and using PITR. Downgrades are
hard work, primarily because in the long run it's going to be totally
untested code, because hardly anyone is going to need it.

> Yes, there is a way: keep both the old and new implementation of the
> index, each with a different OID. The primary key for the pg_am table
> would be name + pg_version - it is similar to the UDT solution. CREATE
> INDEX and REINDEX would use the current implementation.

Bad idea. Indexes are already complicated pieces of code; there's no
guarantee the old code will even work with the new version. It seems like
too much work considering REINDEX will simply fix the problem outright.

Have a nice day,
-- 
Martijn van Oosterhout <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
Re: [HACKERS] Proposal: In-Place upgrade concept
Zdenek Kotala <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Yeah, I'm with Heikki on this. What I see as a sane project definition
>> is:
>>
>> * pg_migrator or equivalent to convert the system catalogs
>> * a hook in ReadBuffer to allow a data page conversion procedure to
>>   be applied, on the basis of checking for old page layout version.

> pg_migrator is a separate tool which requires the old Postgres version,
> and I would like to have a solution in the Postgres binary without the
> old version being present. Very often the new Postgres version is stored
> in the same location (e.g. /usr/bin) and normal users could have a
> problem.

Again, you are setting yourself up for complete failure if you insist on
having every possible nicety in the first version. An incremental approach
is far more likely to succeed than a "big bang".

I don't see a strong need to have a solution in-the-binary at all. I would
envision that packagers of, say, 8.4 would include a minimal 8.3 build
under an old/ subdirectory, and pg_migrator or a similar tool could invoke
the old postmaster from there to do the catalog dumping. (In an RPM or
similar environment, the user could even "rpm -e postgresql-upgrade" to get
rid of the deadwood after completing the upgrade, whereas with an
integrated binary you're stuck carrying around a lot of one-time-use code.)
This strikes me as approximately a thousand percent more maintainable than
trying to have a single set of code coping with multiple catalog
representations. Also it scales easily to supporting more than one back
version, whereas doing the same inside one binary will not scale at all.

Keep in mind that if your proposal involves any serious limitation on the
developers' freedom to refactor internal backend APIs or change catalog
representations around, it *will be rejected*. Do not have any illusions
on that point. It'll be a tough enough sell freezing on-disk
representations for user data. Demanding the internal ability to read old
catalog versions would be a large and ongoing drag on development; I do
not think we'll hold still for it. (To point out just one of many
problems, it'd largely destroy the C-struct-overlay technique for reading
catalogs.)

            regards, tom lane
Re: [HACKERS] Proposal: In-Place upgrade concept
Martijn van Oosterhout wrote:
> On Tue, Jul 03, 2007 at 07:13:29PM +0200, Zdenek Kotala wrote:
>> pg_migrator is a separate tool which requires the old Postgres version,
>> and I would like to have a solution in the Postgres binary without the
>> old version being present. Very often the new Postgres version is
>> stored in the same location (e.g. /usr/bin) and normal users could have
>> a problem.
>
> Possibly. But you have to demonstrate it actually works, and it's best
> to do that with a separate process. Or fix pg_migrator to not require
> the old version; either way will do.

pg_migrator uses pg_dump to dump the catalog, and its main concept is based
on the presence of the old Postgres version. I think "fix it" means
"rewrite it".

>>> I think insisting on a downgrade option is an absolutely certain way
>>> of guaranteeing that the project will fail.
>>
>> As I mentioned before, this is a nice-to-have requirement. I would like
>> to keep it in mind, and when it starts a complexity explosion we can
>> remove it from the requirement list.
>
> It seems to me that if every step of the process is WAL logged, then
> downgrading is simply a matter of restoring and using PITR. Downgrades
> are hard work, primarily because in the long run it's going to be
> totally untested code, because hardly anyone is going to need it.

It is not a downgrade. It is about keeping the old structure until the user
says to convert to the new data structure.

>> Yes, there is a way: keep both the old and new implementation of the
>> index, each with a different OID. The primary key for the pg_am table
>> would be name + pg_version - it is similar to the UDT solution. CREATE
>> INDEX and REINDEX would use the current implementation.
>
> Bad idea. Indexes are already complicated pieces of code; there's no
> guarantee the old code will even work with the new version. It seems
> like too much work considering REINDEX will simply fix the problem
> outright.

I do not expect that the old code will work with the new index structure. I
want to keep both implementations: an old index will be processed by the
old code and a new one will be processed by the new implementation. Each
will have a different OID, and pg_class.relam will point to the correct
implementation. These obsolete index methods could be located in a separate
compat lib.

Zdenek
Re: [HACKERS] Proposal: In-Place upgrade concept
Zdenek Kotala <[EMAIL PROTECTED]> writes:
> It is not a downgrade. It is about keeping the old structure until the
> user says to convert to the new data structure.

As Martijn already pointed out, the odds of problems surfacing only after
that conversion starts seem high enough to render the whole idea a bit
pointless. IMHO it's not worth the enormous development costs it will add
... and it's *certainly* unwise to tie success of the entire
in-place-upgrade project to that one feature.

The attractive point about pg_migrator plus page-at-a-time data upgrade is
that it'd solve 90% of the problem with 10% of the work. If you get that
going, and people get accustomed to working with the development
restrictions associated with data upgradability, then you might be able to
come back and make a case for catalog upgradability and/or downgradability
in some future version. But right now you're asking people to do 90% of
the work before having anything at all.

> I do not expect that the old code will work with the new index
> structure. I want to keep both implementations: an old index will be
> processed by the old code and a new one will be processed by the new
> implementation. Each will have a different OID, and pg_class.relam will
> point to the correct implementation.

I don't think it's quite that easy when you consider user-defined
datatypes. Where are you going to get two sets of opclasses from?

            regards, tom lane
Re: [HACKERS] Proposal: In-Place upgrade concept
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> My thinking is that when a page in the old format is read in, it's
> converted to the new format before doing anything else with it.

Yeah, I'm with Heikki on this. What I see as a sane project definition is:

* pg_migrator or equivalent to convert the system catalogs

* a hook in ReadBuffer to allow a data page conversion procedure to be
  applied, on the basis of checking for old page layout version.

I think insisting on a downgrade option is an absolutely certain way of
guaranteeing that the project will fail.

I'm not sure it's feasible to expect that we can change representations of
user-defined types, either. I don't see how you would do that without
catalog access (to look up the UDT), and the page conversion procedure is
going to have to be able to operate without catalog accesses. (Thought
experiment: a page is read in during crash recovery or PITR slave
operation, and discovered to have the old format.)

BTW, I thought of a likely upgrade problem that we haven't discussed
(AFAIR) in any of the many threads on this subject. What about an index
access method change that involves an index-wide restructuring, such that
it can't be done one page at a time? A plausible example is changing hash
indexes to have multiple buckets per page. Presumably you can fix the index
with REINDEX, but that doesn't meet the goal of limited downtime, if the
index is big. Is there another way?

            regards, tom lane
Re: [HACKERS] Proposal: In-Place upgrade concept
Tom Lane wrote:
> Again, you are setting yourself up for complete failure if you insist on
> having every possible nicety in the first version. An incremental
> approach is far more likely to succeed than a "big bang".

Yes, I know. I don't want to solve everything in one patch. I'm just
looking ahead and trying to summarize the complexity of the problem. I
don't want to end up on a dead road and say "oops" after a couple of months
of development ...

Zdenek
Re: [HACKERS] Proposal: In-Place upgrade concept
Tom Lane wrote:
> Keep in mind that if your proposal involves any serious limitation on
> the developers' freedom to refactor internal backend APIs or change
> catalog representations around, it *will be rejected*. Do not have any
> illusions on that point. It'll be a tough enough sell freezing on-disk
> representations for user data. Demanding the internal ability to read
> old catalog versions would be a large and ongoing drag on development;
> I do not think we'll hold still for it. (To point out just one of many
> problems, it'd largely destroy the C-struct-overlay technique for
> reading catalogs.)

One thing no-one's mentioned is how we're going to deal with definitive
incompatibilities. Examples:

- Tightening of the UTF8 code. Means some text from the old version won't
  transfer.
- Changing the behaviour of greatest() - recently discussed. Might
  invalidate views/application queries.

It's the second example that I can see biting; the UTF stuff is big enough
that it'll be noticed. It'd be all too easy to have a change in some
inet-addr function that you don't notice your app is using. I can't think
of any way of definitively auditing what features are in use (or have
changed between versions). Or are these examples of changes that will only
be allowed, e.g., every other major version?

-- 
Richard Huxton
Archonet Ltd
Re: [HACKERS] Proposal: In-Place upgrade concept
Martijn van Oosterhout <[EMAIL PROTECTED]> writes:
> On Tue, Jul 03, 2007 at 11:36:03AM -0400, Tom Lane wrote:
>> ... (Thought experiment: a page is read in during crash recovery
>> or PITR slave operation, and discovered to have the old format.)
>
> Hmm, actually, what's the problem with PITR restoring a page in the old
> format? As long as it's clear it's the old format, it'll get fixed when
> the page is actually used.

Well, what I'm concerned about is something like a WAL record providing a
new-format tuple to be inserted into a page, and then you find that the
page contains old-format tuples.

[ thinks some more... ]  Actually, so long as we are willing to posit that

1. You're only allowed to upgrade a DB that's been cleanly shut down
   (no replay of old-format WAL logs allowed)

2. Page format conversion is WAL-logged as a complete page replacement

then AFAICS WAL-reading operations should never have to apply any updates
to an old-format page; the first touch of any old page in the WAL sequence
should be a page replacement that updates it to new format. This is not
different from the argument why full_page_writes ensures recovery from
write failures. So in principle the page-conversion stuff should always
operate in a live transaction. (Which is good, because now that I think
about it we couldn't emit a WAL record for the page conversion in those
other contexts.)

I still feel pretty twitchy about letting it do catalog access, though,
because it has to operate at such a low level of the system. bufmgr.c has
no business invoking anything that might do catalog access. If nothing
else there are deadlock issues.

On the whole I think we could define format conversions for user-defined
types as "not our problem". A new version of a UDT that has an incompatible
representation on disk can simply be treated as a new type with a different
OID, exactly as Zdenek was suggesting for index AMs. To upgrade a database
containing such a column, you install "my_udt_old.so" that services the old
representation, ALTER TYPE my_udt RENAME TO my_udt_old, then install new
type my_udt and start using that. Anyway that seems good enough for version
1.0 --- I don't recall that we've ever changed the on-disk representation
of any contrib/ types, so how important is this scenario in the real world?

            regards, tom lane
[HACKERS] ACM Paper relevant to our buffer algorithm
Incidentally, I found this paper in ACM SIGMETRICS 1992 covering more or
less precisely the same algorithm we're using for our clock sweep. I
haven't quite digested it yet myself, so I'm not sure what the conclusions
about weights tell us to do with our buffer usage counter.

I put a copy up for download since, even though it's permitted to copy, I
don't know how to find a public link. It's kind of big, so please download
it and read it locally; don't try to read it from my machine:

http://stark.xeocode.com/~stark/p35-nicola.pdf.gz

Regarding copyright, the paper sez:

    Permission to copy without fee all or part of this material is granted
    provided that the copies are not made or distributed for direct
    commercial advantage, the ACM copyright notice and the title of the
    publication and its date appear, and notice is given that copying is
    by permission of the Association for Computing Machinery. To copy
    otherwise, or to republish, requires a fee and/or specific permission.

-- 
Gregory Stark
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] ACM Paper relevant to our buffer algorithm
Here are some more recent papers that also give good insight into research
in this area:

http://www.cs.usask.ca/~wew036/comprehensive.pdf
http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-05-3.pdf

-- 
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] Still recommending daily vacuum...
On Tue, Jul 03, 2007 at 11:31:08AM +0200, Michael Paesold wrote:
> Joshua D. Drake wrote:
>> Alvaro Herrera wrote:
>>> Joshua D. Drake wrote:
>>>> Did we change the default autovac parameters for 8.3 (beyond turning
>>>> it on)? Because on any reasonably used database, they are way too
>>>> conservative.
>>>
>>> We're still on time to change them ... Any concrete proposals?
>>
>> I could provide numbers from production high-use databases. We could
>> probably back those down a little and make more reasonable numbers.
>
> Please do so. Perhaps others can also tell their typical settings.

FWIW, I normally go with the 8.2 defaults, though I could see dropping
vacuum_scale_factor down to 0.1 or 0.15. I also think the thresholds could
be decreased further, maybe divide by 10.

-- 
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB   http://enterprisedb.com   512.569.9461 (cell)
Re: [HACKERS] Still recommending daily vacuum...
On Mon, Jul 02, 2007 at 11:19:12PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
>> http://developer.postgresql.org/pgdocs/postgres/routine-vacuuming.html :
>
> Well, with autovac defaulting to ON in 8.3, that's certainly obsolete
> text now.
>
> Is there a reason to say anything beyond "use autovac"?

There is; I know that things like web session tables aren't handled very
well by autovacuum if there are any moderately large tables (anything that
will take more than a few minutes to vacuum). Eventually we should be able
to accommodate that case with multiple workers, but we'll need a mechanism
to ensure that at least one worker doesn't get tied up in large vacuums.

-- 
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB   http://enterprisedb.com   512.569.9461 (cell)
Re: [HACKERS] Still recommending daily vacuum...
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > On Mon, Jul 02, 2007 at 11:19:12PM -0400, Tom Lane wrote: >> Is there a reason to say anything beyond "use autovac"? > There is; I know that things like web session tables aren't handled very > well by autovacuum if there are any moderately large tables (anything > that will take more than a few minutes to vacuum). Eventually we should > be able to accommodate that case with multiple workers, but we'll need a > mechanism to ensure that at least one worker doesn't get tied up in > large vacuums. And which part of that do you think isn't resolved in 8.3? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Still recommending daily vacuum...
Jim C. Nasby wrote:
> On Tue, Jul 03, 2007 at 11:31:08AM +0200, Michael Paesold wrote:
>> Joshua D. Drake wrote:
>>> Alvaro Herrera wrote:
>>>> Joshua D. Drake wrote:
>>>>> Did we change the default autovac parameters for 8.3 (beyond turning
>>>>> it on)? Because on any reasonably used database, they are way too
>>>>> conservative.
>>>>
>>>> We're still on time to change them ... Any concrete proposals?
>>>
>>> I could provide numbers from production high-use databases. We could
>>> probably back those down a little and make more reasonable numbers.
>>
>> Please do so. Perhaps others can also tell their typical settings.
>
> FWIW, I normally go with the 8.2 defaults, though I could see dropping
> vacuum_scale_factor down to 0.1 or 0.15. I also think the thresholds
> could be decreased further, maybe divide by 10.

How about pushing thresholds all the way down to 0?

-- 
Alvaro Herrera   http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"Vivir y dejar de vivir son soluciones imaginarias.
La existencia está en otra parte" (Andre Breton)
Re: [HACKERS] what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL
On Tue, Jul 03, 2007 at 11:49:05AM -0400, Bruce Momjian wrote:
> Tom Lane wrote:
>> Alvaro Herrera <[EMAIL PROTECTED]> writes:
>>> Tom Lane escribió:
>>>> I rather doubt that. The most likely implementation would involve
>>>> cloning a "template" entry into pg_class.
>>
>>> How about a new relkind which causes the table to be located in
>>> PGDATA/base//pg_temp_/
>>> So each backend can have its own copy of the table with the same
>>> relfilenode; there's no need for extra catalog entries.
>>
>> Uh-huh. And what do you do with relpages, reltuples, relfrozenxid, and
>> pg_statistic entries? What if one backend wants to TRUNCATE or CLUSTER
>> its copy (requiring a new relfilenode)? Where does ALTER TABLE fit into
>> this?
>
> And what is the use-case for this functionality? What does it give us
> that we don't already have?

The use case is any system that uses temp tables in an OLTP setting, which
certainly isn't uncommon. The problem today (and likewise with a global
temp table that still writes to the catalogs) is that every OLTP operation
that creates or drops a temp table is doing DDL. At best, that leads to a
lot of catalog bloat. Right now it also appears to expose some race
conditions (we've got a customer that's been bit by this, and we've been
able to reproduce some odd behavior in the lab).

-- 
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB   http://enterprisedb.com   512.569.9461 (cell)
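[A minimal illustration of the OLTP pattern Jim describes, with made-up
names: each transaction like this creates, and later drops, rows in
pg_class, pg_attribute, pg_type, etc., which is where the catalog churn
comes from.]

    BEGIN;

    -- Per-transaction scratch table; with hundreds of these per second
    -- this is DDL against the system catalogs every time.
    CREATE TEMP TABLE cart_scratch (item_id int, qty int) ON COMMIT DROP;

    INSERT INTO cart_scratch VALUES (1, 2), (7, 1);
    -- ... work with the scratch data ...

    COMMIT;  -- temp table dropped, leaving dead catalog rows behind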
Re: [HACKERS] Still recommending daily vacuum...
>>> On Tue, Jul 3, 2007 at 3:36 PM, in message <[EMAIL PROTECTED]>,
>>> Tom Lane <[EMAIL PROTECTED]> wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
>> On Mon, Jul 02, 2007 at 11:19:12PM -0400, Tom Lane wrote:
>>> Is there a reason to say anything beyond "use autovac"?
>
>> There is; I know that things like web session tables aren't handled very
>> well by autovacuum if there are any moderately large tables (anything
>> that will take more than a few minutes to vacuum). Eventually we should
>> be able to accommodate that case with multiple workers, but we'll need a
>> mechanism to ensure that at least one worker doesn't get tied up in
>> large vacuums.
>
> And which part of that do you think isn't resolved in 8.3?

We have a 406GB table where 304GB is in one table. The next two tables are
57GB and 40GB. Inserts to these three tables are constant during the
business day, along with inserts, updates, and very few deletes to the
other tables. Database modifications are few and scattered at night and on
weekends. Virtually all queries are during the business day. The large
tables are "insert only" except for a weekend delete of the oldest one week
of data, to keep a rolling set of just over a year. (No, we really don't
want to go to weekly partitions, if it can be avoided.)

Autovacuum is enabled with very aggressive settings, to cover small tables,
including one with about 75 rows that can be updated 100 or more times per
second. Even with these settings there is zero chance of any table of even
moderate size hitting the autovacuum threshold between our scheduled
vacuums. When we tried doing a nightly vacuum analyze starting at the end
of the business day, it ran well into the next day, and the users
complained of slowness until it stopped. We changed to a weeknight vacuum
analyze of the volatile tables which aren't in the big three, and a vacuum
analyze of the entire database right after the weekly delete.

Isn't this a use case where we don't want to count on autovacuum, both from
a table bloat perspective and the user impact perspective, even under 8.3?

In terms of our autovacuum settings, we have several different types of
databases, and in all of them we seem to do well with these changes from
the 8.2 defaults, combined with (except for the above configuration) a
nightly database vacuum:

autovacuum_naptime = 10s
autovacuum_vacuum_threshold = 1
autovacuum_analyze_threshold = 1

Oh, the tiny, high-update tables occasionally bloat to hundreds or
thousands of pages because of long-running transactions, so we schedule a
daily cluster on those, just to keep things tidy.

-Kevin
Re: [HACKERS] Proposal: In-Place upgrade concept
"Tom Lane" <[EMAIL PROTECTED]> writes: > 2. Page format conversion is WAL-logged as a complete page replacement It seems we get that for free. By definition any modification to a page after conversion will be the first record for that page since a checkpoint. It just means we have to force full_page_writes to be true, at least for a buffer that's been converted if not whenever we're processing a mixed format database. > On the whole I think we could define format conversions for user-defined > types as "not our problem". Hm, perhaps we could do it by doing as much of the work up-front as possible. Requiring datatypes to define a function in pg_proc to do the conversion and require that it be a function which can stand on its own without catalog lookups or other facilities. Then when we start up on a mixed format database we load all those functions into a hash by typeoid and provide that hash to bufmgr. By definition any datatypes we expect to find in the database must already be in the catalog before we start. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Still recommending daily vacuum...
>>> On Tue, Jul 3, 2007 at 5:17 PM, in message <[EMAIL PROTECTED]>,
>>> "Kevin Grittner" <[EMAIL PROTECTED]> wrote:
> We have a 406GB table where 304GB is in one table. The next two tables

It's probably obvious, but I meant a 406GB database. Sorry.
Re: [HACKERS] Still recommending daily vacuum...
Kevin Grittner wrote:
> We have a 406GB table where 304GB is in one table. The next two tables
> are 57GB and 40GB. Inserts to these three tables are constant during the
> business day, along with inserts, updates, and very few deletes to the
> other tables. Database modifications are few and scattered at night and
> on weekends. Virtually all queries are during the business day. The
> large tables are "insert only" except for a weekend delete of the oldest
> one week of data, to keep a rolling set of just over a year. (No, we
> really don't want to go to weekly partitions, if it can be avoided.)
>
> Autovacuum is enabled with very aggressive settings, to cover small
> tables, including one with about 75 rows that can be updated 100 or more
> times per second. Even with these settings there is zero chance of any
> table of even moderate size hitting the autovacuum threshold between our
> scheduled vacuums. When we tried doing a nightly vacuum analyze starting
> at the end of the business day, it ran well into the next day, and the
> users complained of slowness until it stopped. We changed to a weeknight
> vacuum analyze of the volatile tables which aren't in the big three, and
> a vacuum analyze of the entire database right after the weekly delete.

Sounds like you would be served by setting those specific tables to a lower
vacuum scale factor (keeping a more normal default for the rest of the
tables), and having a non-zero vacuum delay setting (to avoid excessive I/O
consumption). Have you tried that?

The problem you would still have with 8.2 is that while one of these tables
is being vacuumed, the rest won't be vacuumed at all. In 8.3 the other
tables can still be vacuumed regularly with the big vacuum still running (a
feature I dubbed "multiple workers", but we're still waiting to know what
name the marketing guys are gonna use).

> In terms of our autovacuum settings, we have several different types of
> databases, and in all of them we seem to do well with these changes from
> the 8.2 defaults, combined with (except for the above configuration) a
> nightly database vacuum:
>
> autovacuum_naptime = 10s

Another change in 8.3 is that the naptime is per-database, i.e. the time
between two consecutive autovac runs on a database. So with a setting of
10s, if you have 10 databases there will be one autovac run per second,
whereas on 8.2 there would be one autovac each 10 seconds (unless you run
out of worker slots).

> Oh, the tiny, high-update tables occasionally bloat to hundreds or
> thousands of pages because of long-running transactions, so we schedule
> a daily cluster on those, just to keep things tidy.

If you can afford the cluster then there's no problem. I don't expect that
to change in 8.3.

-- 
Alvaro Herrera   http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are." -- Charles J. Sykes' advice to teenagers
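[In 8.2 the per-table overrides Alvaro mentions live in the pg_autovacuum
system catalog. The sketch below is from memory and uses a made-up table
name and arbitrary numbers; check the column list against your version's
documentation before using it. -1 in a column means "fall back to the
global setting".]

    INSERT INTO pg_autovacuum
           (vacrelid, enabled,
            vac_base_thresh, vac_scale_factor,
            anl_base_thresh, anl_scale_factor,
            vac_cost_delay, vac_cost_limit,
            freeze_min_age, freeze_max_age)
    VALUES ('big_history_table'::regclass, true,
            1000, 0.05,     -- vacuum after 1000 rows + 5% of the table
            1000, 0.02,     -- analyze after 1000 rows + 2% of the table
            20,   -1,       -- throttle I/O: 20 ms cost delay
            -1,   -1);      -- keep default freeze behaviour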
Re: [HACKERS] Proposal: In-Place upgrade concept
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
>> 2. Page format conversion is WAL-logged as a complete page replacement

> It seems we get that for free. By definition any modification to a page after
> conversion will be the first record for that page since a checkpoint.

No, I don't think so. Consider a page that's been sucked in for just a SELECT. Or did you plan to pay the page conversion cost over and over again until it's finally modified? Anyway the whole idea fails if the first mod that needs to be made is just a hint-bit update.

> Hm, perhaps we could do it by doing as much of the work up-front as possible.
> Requiring datatypes to define a function in pg_proc to do the conversion and
> require that it be a function which can stand on its own without catalog
> lookups or other facilities.

I don't think you quite got the point of "no catalog lookups". That includes not looking up pg_type or pg_proc entries.

regards, tom lane

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Proposal: In-Place upgrade concept
Richard Huxton <[EMAIL PROTECTED]> writes:
> One thing no-one's mentioned is how we're going to deal with definitive
> incompatibilities.

I don't really think that in-place update changes that story at all. The advice has always been "read the release notes and test your applications before updating". The only thing that in-place update will offer is shorter update downtime than dump/reload.

regards, tom lane

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Proposal: In-Place upgrade concept
"Tom Lane" <[EMAIL PROTECTED]> writes: > Gregory Stark <[EMAIL PROTECTED]> writes: >> "Tom Lane" <[EMAIL PROTECTED]> writes: >>> 2. Page format conversion is WAL-logged as a complete page replacement > >> It seems we get that for free. By definition any modification to a page after >> conversion will be the first record for that page since a checkpoint. > > No, I don't think so. Consider a page that's been sucked in for just a > SELECT. Or did you plan to pay the page conversion cost over and over > again until it's finally modified? Anyway the whole idea fails if the > first mod that needs to be made is just a hint-bit update. Ah, true, it would have to dirty the page and that means putting an LSN in and that means needing a WAL record to point it to. But the actual record is almost irrelevant as any record will cause the full page to be attached. >> Hm, perhaps we could do it by doing as much of the work up-front as possible. >> Requiring datatypes to define a function in pg_proc to do the conversion and >> require that it be a function which can stand on its own without catalog >> lookups or other facilities. > > I don't think you quite got the point of "no catalog lookups". That > includes not looking up pg_type or pg_proc entries. ISTM that it's really just not looking up pg_type or pg_proc entries *from within the low level storage code* that's evil. If they're all sucked in on backend startup and restricted by policy to not be able to do anything fancy themselves then I don't see your objection. I do have an objection though: TOAST data. Even were the bufmgr to be provided with a mechanism to find a data type conversion function I don't see what it would do about a toasted datum. Obviously bufmgr can't fiddle with it, and if it leaves it for pg_detoast_datum then even if there were some way for pg_detoast_datum to know whether it's an old format datum or a new format one there's no way for it replace the TOAST datum on disk with the new one. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Proposal: In-Place upgrade concept
Gregory Stark <[EMAIL PROTECTED]> writes:
> I do have an objection though: TOAST data. Even were the bufmgr to be
> provided with a mechanism to find a data type conversion function I
> don't see what it would do about a toasted datum.

Urgh, that *is* a nasty thought :-(

Out-of-line datums aren't the only issue, either: consider inline compressed datums. A data representation change, even one that is known not to increase the ordinary uncompressed size of the datum, could easily render it slightly less compressible, resulting in a risk that the tuple doesn't fit on the page anymore. It hardly seems practical (maybe not even possible) to guarantee that this cannot happen.

So maybe we are up against the conclusion that in-place updates cannot support datatype representation changes, at least not for toastable datatypes. We could still handle 'em by the expedient suggested upthread for user-defined types, ie the "new representation" is treated as a whole new type. That's not terribly appetizing though; I had expected we could be more efficient for the case of changes in built-in types.

regards, tom lane

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
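A small standalone illustration of the failure mode being described, with made-up sizes (not PostgreSQL code): after conversion and recompression the datum comes out slightly larger, and on a nearly full page there is nowhere left to put the rewritten tuple.

/*
 * Hypothetical sketch, not PostgreSQL source: an in-place rewrite is only
 * safe if the converted, recompressed tuple fits in the old tuple's slot
 * plus whatever free space the page happens to have.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    size_t old_tuple_len;       /* length of the tuple in the old format */
    size_t new_tuple_len;       /* length after conversion + recompression */
    size_t page_free_space;     /* free space currently on the page */
} ConversionCheck;

static bool
fits_in_place(const ConversionCheck *c)
{
    /* Space available = the slot we reclaim + existing free space. */
    return c->new_tuple_len <= c->old_tuple_len + c->page_free_space;
}

int
main(void)
{
    /* The datum recompresses 16 bytes larger, and the page is nearly full. */
    ConversionCheck c = {.old_tuple_len = 2000,
                         .new_tuple_len = 2016,
                         .page_free_space = 8};

    printf("in-place conversion %s\n",
           fits_in_place(&c) ? "possible" : "impossible: tuple no longer fits");
    return 0;
}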
[HACKERS] Why so many out-of-disk-space failures on buildfarm machines?
It seems like we see a remarkable number of occurrences of $subject. For instance, right now we have these members failing on various branches:

echidna     No space left on device
asp         No space left on device
herring     No space left on device (icc seems particularly unable to cope with this, or at least I suspect that's the reason for some builds failing with that bizarre message)
kitegcc     quoth "Internal compiler error: Segmentation fault"
wildebeest  long-standing configuration error (no Tk installed)
wombat      long-standing configuration error (no Tk installed)

I realize that a lot of these members are running on old underpowered machines with not so much disk, but is it possible that the buildfarm itself is leaking disk space? Not cleaning up log files, for instance?

regards, tom lane

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Still recommending daily vacuum...
Alvaro Herrera wrote:
> Jim C. Nasby wrote:
>> FWIW, I normally go with the 8.2 defaults, though I could see dropping
>> vacuum_scale_factor down to 0.1 or 0.15. I also think the thresholds
>> could be decreased further, maybe divide by 10.
>
> How about pushing thresholds all the way down to 0?

As long as it handles small (or zero-row) tables OK, then yes. The base_threshold in the original contrib autovacuum was just an easy way to avoid vacuuming really small tables too often. If a table has only 10 rows, it's going to get vacuumed every time one row is updated. I guess that's not a big problem with a table that small, but it still seems excessive. If you think this isn't a problem with the current autovacuum, then sure, turn it down to zero, and perhaps we can even get rid of it altogether in another release or two.

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
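The arithmetic behind that worry, using the 10-row example from the message above: autovacuum vacuums a table once its dead-row count exceeds

\[
\text{base threshold} + \text{scale factor} \times \text{reltuples},
\]

so with the base threshold pushed to 0 and the stock 0.2 scale factor, the trigger for a 10-row table is only \(0.2 \times 10 = 2\) dead rows. A nonzero base threshold is what keeps such tiny tables from being revacuumed after every couple of updates.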
Re: [HACKERS] Why so many out-of-disk-space failures on buildfarm machines?
On Tue, 3 Jul 2007, Tom Lane wrote:
> I realize that a lot of these members are running on old underpowered
> machines with not so much disk, but is it possible that the buildfarm
> itself is leaking disk space? Not cleaning up log files for instance?

No, the buildfarm does not leak disk space. It is possible that members are configured with --keepall, which keeps the entire directory tree if a failure occurs. That can fill up a lot of space quickly when you get a failure.

Kris Jurka

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL
> The use case is any system that uses temp tables in an OLTP setting,
> which certainly isn't uncommon. The problem is that today (and equally
> with a global temp table that still writes to the catalogs), every OLTP
> operation that creates or drops a temp table is doing DDL. At best, that
> leads to a lot of catalog bloat. Right now, it appears to also expose
> some race conditions (we've got a customer that's been bitten by this
> and we've been able to reproduce some odd behavior in the lab).

The solution is to fix the bloat, not add a work-around.

-- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org