Re: [HACKERS] [GENERAL] Autovacuum Improvements
On Mon, Jan 22, 2007 at 05:51:53PM +, Gregory Stark wrote: > Actually no. A while back I did experiments to see how fast reading a file > sequentially was compared to reading the same file sequentially but skipping > x% of the blocks randomly. The results were surprising (to me) and depressing. > The breakeven point was about 7%. I assume this means you were reading 7% of the blocks, not skipping 7% of the blocks when you broke even? I presume by break-even you mean it took just as long, time-wise. But did it have the same effect on system load? If reading only 7% of the blocks allows the drive to complete other requests more quickly then it's beneficial, even if the vacuum takes longer. This may be a silly thought, I'm not sure how drives handle multiple requests... Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate.
Re: [HACKERS] Updateable cursors
Lukas Kahwe Smith wrote: > Joshua D. Drake wrote: > >> Great! I will put it on my, "Remember to bug Arul" list :) > > Hey Joshua, > > could you put this stuff here: > http://developer.postgresql.org/index.php/Todo:WishlistFor83 Sure if you bother to unlock the page for me ;) > > I will try to find some time during this week (likely on the weekend) to > also try and figure out if these items are real and if the people still > think they can do them for 8.3 .. your additions would be most helpful. > > regards, > Lukas > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] pg_dump ANALYZE statements
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > On Mon, Jan 22, 2007 at 03:40:17PM +, Simon Riggs wrote: >> This would add a table-specific ANALYZE statement following each table's >> actions. > It'd probably be best to put it before any index creating activities, No, because then you'd fail to accumulate any stats on partial or functional indexes. There's been talk of using the presence of multi-column indexes to guide creation of cross-column statistics, too. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Updateable cursors
Joshua D. Drake wrote: Great! I will put it on my, "Remember to bug Arul" list :) Hey Joshua, could you put this stuff here: http://developer.postgresql.org/index.php/Todo:WishlistFor83 I will try to find some time during this week (likely on the weekend) to also try and figure out if these items are real and if the people still think they can do them for 8.3 .. your additions would be most helpful. regards, Lukas ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Default permissions from schemas
Stephen Frost <[EMAIL PROTECTED]> writes: > Following up on my reply to Joshua, what I'd like to propose is, for > comments and suggestions: > ALTER SCHEMA name [ [ WITH ] [ DEFAULT ] option [ ... ] ] > where option can be: > { GRANT { { SELECT | INSERT | UPDATE | DELETE | RULE | REFERENCES | TRIGGER | EXECUTE } > [,...] | ALL [ PRIVILEGES ] } > TO { role | PUBLIC } [, ...] [ WITH GRANT OPTION ] > } [, ...] > OWNER role This seems to ignore the problem that different types of objects have different privileges. E.g., if I want to grant USAGE on all sequences that doesn't necessarily mean I want to grant USAGE on all languages. > When not-null the 'nspdefowner' would be the owner of all > objects created in the schema. Whoa. You are going to allow people to create objects owned by someone else? I don't think so ... most Unix systems have forbidden object give-away for years, for very good reasons. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
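Tom's objection above can be seen directly in the privilege lists GRANT accepts for different object types; a sketch, with made-up role and object names:

    GRANT USAGE, SELECT ON SEQUENCE order_id_seq TO app_user;  -- sequences: USAGE/SELECT/UPDATE
    GRANT USAGE ON LANGUAGE plpgsql TO app_user;               -- languages: USAGE only
    GRANT SELECT, INSERT, UPDATE ON orders TO app_user;        -- tables: a different privilege set again

A single schema-wide default GRANT list, as proposed, has no obvious way to distinguish these cases.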
Re: [HACKERS] Updateable cursors
FAST PostgreSQL wrote: > On Tue, 23 Jan 2007 15:48, Joshua D. Drake wrote: >> FAST PostgreSQL wrote: >>> We are trying to develop the updateable cursors functionality into >>> Postgresql. I have given below details of the design and also issues we >>> are facing. Looking forward to the advice on how to proceed with these >>> issues. >>> >>> Rgds, >>> Arul Shaji >> Would this be something that you would hope to submit for 8.3? > > Yes definitely. If we can finish it before the feature freeze of course. Great! I will put it on my, "Remember to bug Arul" list :) Sincerely, Joshua D. Drake > > Rgds, > Arul Shaji > > >> Joshua D. Drake >> >>> 1. Introduction >>> -- >>> This is a combined proposal and design document for adding updatable >>> (insensitive) cursor capability to the PostgreSQL database. >>> There have already been a couple of previous proposals since 2003 for >>> implementing this feature so there appears to be community interest in >>> doing so. This will enable the following constructs to be processed: >>> >>> >>> UPDATE SET value_list WHERE CURRENT OF >>> DELETE FROM WHERE CURRENT OF >>> >>> This has the effect of users being able to update or delete specific rows >>> of a table, as defined by the row currently fetched into the cursor. >>> >>> >>> 2. Overall Conceptual Design >>> - >>> The design is considered from the viewpoint of progression of a command >>> through the various stages of processing, from changes to the file >>> ?gram.y? to implement the actual grammar changes, through to changes in >>> the Executor portion of the database architecture. >>> >>> 2.1 Changes to the Grammar >>> -- >>> The following changes will be done to the PostgreSQL grammar: >>> >>> UPDATE statement has the option ?WHERE CURRENT OF ? added >>> DELETE statement has the option ?WHERE CURRENT OF ? added >>> >>> The cursor_name data is held in the UpdateStmt and DeleteStmt structures >>> and contains just the name of the cursor. >>> >>> The pl/pgsql grammar changes in the same manner. >>> >>> The word CURRENT will be added to the ScanKeywords array in keywords.c. >>> >>> >>> 2.2 Changes to Affected Data Structures >>> -- >>> The following data structures are affected by this change: >>> >>> Portal structure, QueryDesc structure, the UpdateStmt and DeleteStmt >>> structures >>> >>> The Portal will contain a list of structures of relation ids and tuple >>> ids relating to the tuple held in the QueryDesc structure. There will be >>> one entry in the relation and tuple id list for each entry in the >>> relation-list of the statement below: >>> >>> DECLARE [WITH HOLD] SELECT FOR UPDATE OF >>> >>> The QueryDesc structure will contain the relation id and the tuple id >>> relating to the tuple obtained via the FETCH command so that it can be >>> propagated back to the Portal for storage in the list described above. >>> >>> The UpdateStmt and DeleteStmt structures have the cursor name added so >>> that the information is available for use in obtaining the portal >>> structure related to the cursor previously opened via the DECLARE CURSOR >>> request. >>> >>> >>> 2.3 Changes to the SQL Parser >>> >>> At present, although the FOR UPDATE clause of the DECLARE CURSOR command >>> has been present in the grammar, it causes an error message later in the >>> processing since cursors are currently not updatable. This now needs to >>> change. The ?FOR UPDATE? clause has to be valid, but not the ?FOR SHARE? >>> clause. >>> >>> The relation names that follow the ?FOR UPDATE? 
clause will be added to >>> the rtable in the Query structure and identified by means of the rowMarks >>> array. In the case of an updatable cursor the FOR SHARE option is not >>> allowed therefore all entries in the rtable that are identified by the >>> rowMarks array must relate to tables that are FOR UPDATE. >>> >>> In the UPDATE or DELETE statements the ?WHERE CURRENT OF ? >>> clause results in the cursor name being placed in the UpdateStmt or >>> DeleteStmt structure. During the processing of the functions - >>> transformDeleteStmt() and transformUpdateStmt() - the cursor name is used >>> to obtain a pointer to the related Portal structure and the tuple >>> affected by the current UPDATE or DELETE statement is extracted from the >>> Portal, where it has been placed as the result of a previous FETCH >>> request. At this point all the information for the UPDATE or DELETE >>> statement is available so the statements can be transformed into standard >>> UPDATE or DELETE statements and sent for re-write/planning/execution as >>> usual. >>> >>> 2.4 Changes to the Optimizer >>> -- >>> There is a need to add a TidScan node to planning UPDATE / DELETE >>> statements where the statements are ?UPDATE / DELETE at position?. This >>> is to enable the tuple ids of the tuples in the tables
Re: [HACKERS] Updateable cursors
On Tue, 23 Jan 2007 15:48, Joshua D. Drake wrote: > FAST PostgreSQL wrote: > > We are trying to develop the updateable cursors functionality into > > Postgresql. I have given below details of the design and also issues we > > are facing. Looking forward to the advice on how to proceed with these > > issues. > > > > Rgds, > > Arul Shaji > > Would this be something that you would hope to submit for 8.3? Yes definitely. If we can finish it before the feature freeze of course. Rgds, Arul Shaji > Joshua D. Drake > > > 1. Introduction > > -- > > This is a combined proposal and design document for adding updatable > > (insensitive) cursor capability to the PostgreSQL database. > > There have already been a couple of previous proposals since 2003 for > > implementing this feature so there appears to be community interest in > > doing so. This will enable the following constructs to be processed: > > > > > > UPDATE SET value_list WHERE CURRENT OF > > DELETE FROM WHERE CURRENT OF > > > > This has the effect of users being able to update or delete specific rows > > of a table, as defined by the row currently fetched into the cursor. > > > > > > 2. Overall Conceptual Design > > - > > The design is considered from the viewpoint of progression of a command > > through the various stages of processing, from changes to the file > > ?gram.y? to implement the actual grammar changes, through to changes in > > the Executor portion of the database architecture. > > > > 2.1 Changes to the Grammar > > -- > > The following changes will be done to the PostgreSQL grammar: > > > > UPDATE statement has the option ?WHERE CURRENT OF ? added > > DELETE statement has the option ?WHERE CURRENT OF ? added > > > > The cursor_name data is held in the UpdateStmt and DeleteStmt structures > > and contains just the name of the cursor. > > > > The pl/pgsql grammar changes in the same manner. > > > > The word CURRENT will be added to the ScanKeywords array in keywords.c. > > > > > > 2.2 Changes to Affected Data Structures > > -- > > The following data structures are affected by this change: > > > > Portal structure, QueryDesc structure, the UpdateStmt and DeleteStmt > > structures > > > > The Portal will contain a list of structures of relation ids and tuple > > ids relating to the tuple held in the QueryDesc structure. There will be > > one entry in the relation and tuple id list for each entry in the > > relation-list of the statement below: > > > > DECLARE [WITH HOLD] SELECT FOR UPDATE OF > > > > The QueryDesc structure will contain the relation id and the tuple id > > relating to the tuple obtained via the FETCH command so that it can be > > propagated back to the Portal for storage in the list described above. > > > > The UpdateStmt and DeleteStmt structures have the cursor name added so > > that the information is available for use in obtaining the portal > > structure related to the cursor previously opened via the DECLARE CURSOR > > request. > > > > > > 2.3 Changes to the SQL Parser > > > > At present, although the FOR UPDATE clause of the DECLARE CURSOR command > > has been present in the grammar, it causes an error message later in the > > processing since cursors are currently not updatable. This now needs to > > change. The ?FOR UPDATE? clause has to be valid, but not the ?FOR SHARE? > > clause. > > > > The relation names that follow the ?FOR UPDATE? clause will be added to > > the rtable in the Query structure and identified by means of the rowMarks > > array. 
In the case of an updatable cursor the FOR SHARE option is not > > allowed therefore all entries in the rtable that are identified by the > > rowMarks array must relate to tables that are FOR UPDATE. > > > > In the UPDATE or DELETE statements the ?WHERE CURRENT OF ? > > clause results in the cursor name being placed in the UpdateStmt or > > DeleteStmt structure. During the processing of the functions - > > transformDeleteStmt() and transformUpdateStmt() - the cursor name is used > > to obtain a pointer to the related Portal structure and the tuple > > affected by the current UPDATE or DELETE statement is extracted from the > > Portal, where it has been placed as the result of a previous FETCH > > request. At this point all the information for the UPDATE or DELETE > > statement is available so the statements can be transformed into standard > > UPDATE or DELETE statements and sent for re-write/planning/execution as > > usual. > > > > 2.4 Changes to the Optimizer > > -- > > There is a need to add a TidScan node to planning UPDATE / DELETE > > statements where the statements are ?UPDATE / DELETE at position?. This > > is to enable the tuple ids of the tuples in the tables relating to the > > query to be obtained. There will need to be a new mechanism to achieve > > this, as at present, a Tid scan is
Re: [HACKERS] autovacuum process handling
Jim C. Nasby wrote: > On Mon, Jan 22, 2007 at 04:24:28PM -0300, Alvaro Herrera wrote: > > 4. Launcher will be a continuously-running process, akin to bgwriter; > > connected to shared memory > > So would it use up a database connection? No. It's connected to shared memory and has access to pgstats, but it's not connected to any database so it's not counted. You'd say it has the same status as the bgwriter. > > 5. Workers will be direct postmaster children; so postmaster will get > > SIGCHLD when worker dies > > As part of this I think we need to make it more obvious how all of this > ties into max_connections. Currently, autovac ties up one of the > super-user connections whenever it's not asleep; these changes would > presumably mean that more of those connections could be tied up. Sure. > Rather than forcing users to worry about adjusting max_connections and > superuser_reserved_connections to accommodate autovacuum, the system > should handle it for them. > > Were you planning on limiting the number of concurrent vacuum processes > that could be running? If so, we could probably just increase superuser > connections by that amount. If not, we might need to think of something > else... The fact that I'm currently narrowly focused on process handling means that I don't want to touch scheduling at all for now, so I'm gonna make it so that the launcher decides to launch a worker run only when no other worker is running. Thus only a single vacuum "thread" at any time. In the meantime you're welcome to think on the possible solutions to that problem, which we'll have to attack at some point in the (hopefully) near future ;-) -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Updateable cursors
FAST PostgreSQL wrote: > We are trying to develop the updateable cursors functionality into > Postgresql. I have given below details of the design and also issues we are > facing. Looking forward to the advice on how to proceed with these issues. > > Rgds, > Arul Shaji Would this be something that you would hope to submit for 8.3? Joshua D. Drake > > > > > > 1. Introduction > -- > This is a combined proposal and design document for adding updatable > (insensitive) cursor capability to the PostgreSQL database. > There have already been a couple of previous proposals since 2003 for > implementing this feature so there appears to be community interest in doing > so. This will enable the following constructs to be processed: > > > UPDATE SET value_list WHERE CURRENT OF > DELETE FROM WHERE CURRENT OF > > This has the effect of users being able to update or delete specific rows of > a table, as defined by the row currently fetched into the cursor. > > > 2. Overall Conceptual Design > - > The design is considered from the viewpoint of progression of a command > through the various stages of processing, from changes to the file ‘gram.y’ > to implement the actual grammar changes, through to changes in the Executor > portion of the database architecture. > > 2.1 Changes to the Grammar > -- > The following changes will be done to the PostgreSQL grammar: > > UPDATE statement has the option ‘WHERE CURRENT OF ’ added > DELETE statement has the option ‘WHERE CURRENT OF ’ added > > The cursor_name data is held in the UpdateStmt and DeleteStmt structures and > contains just the name of the cursor. > > The pl/pgsql grammar changes in the same manner. > > The word CURRENT will be added to the ScanKeywords array in keywords.c. > > > 2.2 Changes to Affected Data Structures > -- > The following data structures are affected by this change: > > Portal structure, QueryDesc structure, the UpdateStmt and DeleteStmt > structures > > The Portal will contain a list of structures of relation ids and tuple ids > relating to the tuple held in the QueryDesc structure. There will be one > entry in the relation and tuple id list for each entry in the relation-list > of the statement below: > > DECLARE [WITH HOLD] SELECT FOR UPDATE OF > > The QueryDesc structure will contain the relation id and the tuple id > relating to the tuple obtained via the FETCH command so that it can be > propagated back to the Portal for storage in the list described above. > > The UpdateStmt and DeleteStmt structures have the cursor name added so that > the information is available for use in obtaining the portal structure > related to the cursor previously opened via the DECLARE CURSOR request. > > > 2.3 Changes to the SQL Parser > > At present, although the FOR UPDATE clause of the DECLARE CURSOR command has > been present in the grammar, it causes an error message later in the > processing since cursors are currently not updatable. This now needs to > change. The ‘FOR UPDATE’ clause has to be valid, but not the ‘FOR SHARE’ > clause. > > The relation names that follow the ‘FOR UPDATE’ clause will be added to the > rtable in the Query structure and identified by means of the rowMarks array. > In the case of an updatable cursor the FOR SHARE option is not allowed > therefore all entries in the rtable that are identified by the rowMarks array > must relate to tables that are FOR UPDATE. > > In the UPDATE or DELETE statements the ‘WHERE CURRENT OF ’ > clause results in the cursor name being placed in the UpdateStmt or > DeleteStmt structure. 
During the processing of the functions - > transformDeleteStmt() and transformUpdateStmt() - the cursor name is used to > obtain a pointer to the related Portal structure and the tuple affected by > the current UPDATE or DELETE statement is extracted from the Portal, where it > has been placed as the result of a previous FETCH request. At this point all > the information for the UPDATE or DELETE statement is available so the > statements can be transformed into standard UPDATE or DELETE statements and > sent for re-write/planning/execution as usual. > > 2.4 Changes to the Optimizer > -- > There is a need to add a TidScan node to planning UPDATE / DELETE statements > where the statements are ‘UPDATE / DELETE at position’. This is to enable the > tuple ids of the tuples in the tables relating to the query to be obtained. > There will need to be a new mechanism to achieve this, as at present, a Tid > scan is done only if there is a standard WHERE condition on update or delete > statements to provide Tid qualifier data. > > > 2.5 Changes to the Executor > --- > There are various options that have been considered for this part of the > enhancement. Th
[HACKERS] Updateable cursors
We are trying to develop the updateable cursors functionality into Postgresql. I have given below details of the design and also issues we are facing. Looking forward to the advice on how to proceed with these issues. Rgds, Arul Shaji 1. Introduction -- This is a combined proposal and design document for adding updatable (insensitive) cursor capability to the PostgreSQL database. There have already been a couple of previous proposals since 2003 for implementing this feature so there appears to be community interest in doing so. This will enable the following constructs to be processed: UPDATE SET value_list WHERE CURRENT OF DELETE FROM WHERE CURRENT OF This has the effect of users being able to update or delete specific rows of a table, as defined by the row currently fetched into the cursor. 2. Overall Conceptual Design - The design is considered from the viewpoint of progression of a command through the various stages of processing, from changes to the file gram.y to implement the actual grammar changes, through to changes in the Executor portion of the database architecture. 2.1 Changes to the Grammar -- The following changes will be done to the PostgreSQL grammar: UPDATE statement has the option WHERE CURRENT OF added DELETE statement has the option WHERE CURRENT OF added The cursor_name data is held in the UpdateStmt and DeleteStmt structures and contains just the name of the cursor. The pl/pgsql grammar changes in the same manner. The word CURRENT will be added to the ScanKeywords array in keywords.c. 2.2 Changes to Affected Data Structures -- The following data structures are affected by this change: Portal structure, QueryDesc structure, the UpdateStmt and DeleteStmt structures The Portal will contain a list of structures of relation ids and tuple ids relating to the tuple held in the QueryDesc structure. There will be one entry in the relation and tuple id list for each entry in the relation-list of the statement below: DECLARE [WITH HOLD] SELECT FOR UPDATE OF The QueryDesc structure will contain the relation id and the tuple id relating to the tuple obtained via the FETCH command so that it can be propagated back to the Portal for storage in the list described above. The UpdateStmt and DeleteStmt structures have the cursor name added so that the information is available for use in obtaining the portal structure related to the cursor previously opened via the DECLARE CURSOR request. 2.3 Changes to the SQL Parser At present, although the FOR UPDATE clause of the DECLARE CURSOR command has been present in the grammar, it causes an error message later in the processing since cursors are currently not updatable. This now needs to change. The FOR UPDATE clause has to be valid, but not the FOR SHARE clause. The relation names that follow the FOR UPDATE clause will be added to the rtable in the Query structure and identified by means of the rowMarks array. In the case of an updatable cursor the FOR SHARE option is not allowed therefore all entries in the rtable that are identified by the rowMarks array must relate to tables that are FOR UPDATE. In the UPDATE or DELETE statements the WHERE CURRENT OF clause results in the cursor name being placed in the UpdateStmt or DeleteStmt structure. 
During the processing of the functions - transformDeleteStmt() and transformUpdateStmt() - the cursor name is used to obtain a pointer to the related Portal structure and the tuple affected by the current UPDATE or DELETE statement is extracted from the Portal, where it has been placed as the result of a previous FETCH request. At this point all the information for the UPDATE or DELETE statement is available so the statements can be transformed into standard UPDATE or DELETE statements and sent for re-write/planning/execution as usual. 2.4 Changes to the Optimizer -- There is a need to add a TidScan node to planning UPDATE / DELETE statements where the statements are UPDATE / DELETE at position. This is to enable the tuple ids of the tuples in the tables relating to the query to be obtained. There will need to be a new mechanism to achieve this, as at present, a Tid scan is done only if there is a standard WHERE condition on update or delete statements to provide Tid qualifier data. 2.5 Changes to the Executor --- There are various options that have been considered for this part of the enhancement. These are described in the sections below. We would like to hear opinions on which option is the best way to go or if none of these is acceptable, any alternate ideas ? Option 1 MVCC Via Continuous Searching of Database The Executor is to be changed in the following ways: 1) When the FETCH statement is executed the id
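For reference, since the archive has stripped the angle-bracketed placeholders from the syntax sketched above, the proposed constructs would be used roughly as follows (a sketch only; the table and cursor names are made up, and none of this works before the feature is implemented):

    BEGIN;
    DECLARE c CURSOR FOR
        SELECT id, qty FROM line_items FOR UPDATE OF line_items;

    FETCH NEXT FROM c;                                        -- position the cursor on a row
    UPDATE line_items SET qty = qty + 1 WHERE CURRENT OF c;   -- update the row under the cursor

    FETCH NEXT FROM c;
    DELETE FROM line_items WHERE CURRENT OF c;                -- delete the newly fetched row

    CLOSE c;
    COMMIT;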
Re: [HACKERS] Piggybacking vacuum I/O
Heikki Linnakangas <[EMAIL PROTECTED]> wrote: > Vacuum is done in 3 phases: > 1. Scan heap > 2. Vacuum index > 3. Vacuum heap > A variation of the scheme would be to keep scanning pages that are in > cache, until the tid list reaches a predefined size, instead of keeping > track of which pages have already been seen. That would deal better with > tables with hot and cold spots, but it couldn't advance the relfrozenid > because there would be no guarantee that all pages are visited. Also, we > could start 1st phase of the next vacuum, while we're still in the 3rd > phase of previous one. ISTM this is another DSM, one with tuple-level accuracy rather than page-level. One of the benefits is that we can skip the 1st phase of vacuum; we will have a TID list of dead tuples at the start of vacuum, so we can start from the 2nd phase. I have another idea for the use of TID lists -- store the TIDs after the 1st or 2nd phase, and exit the vacuum. At the next vacuum, we will do both the previous 3rd phase and the new 1st phase at once, so that I/Os are reduced to (ndirtyblocks + nindexblocks) from (2*ndirtyblocks + nindexblocks) on average. We already use a similar method in vacuuming btree indexes to collect recyclable empty pages. I think piggybacking of I/Os is very useful. The buffer manager helps us fold up some of the I/Os, but doing it explicitly is more effective. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Default permissions from schemas
Greetings, * Stephen Frost ([EMAIL PROTECTED]) wrote: > It seems unlikely that I'm going to have time at the rate things are > going but I was hoping to take a whack at default permissions/ownership > by schema. Kind of a umask-type thing but for schemas instead of roles > (though I've thought about it per role and that might also solve the > particular problem we're having atm). Following up on my reply to Joshua, what I'd like to propose is, for comments and suggestions: ALTER SCHEMA name [ [ WITH ] [ DEFAULT ] option [ ... ] ] where option can be: { GRANT { { SELECT | INSERT | UPDATE | DELETE | RULE | REFERENCES | TRIGGER | EXECUTE } [,...] | ALL [ PRIVILEGES ] } TO { role | PUBLIC } [, ...] [ WITH GRANT OPTION ] } [, ...] OWNER role pg_namespace would be modified to have two new columns, nspdefowner oid, and nspdefacl aclitem[]. When NULL these would have no effect. When not-null the 'nspdefowner' would be the owner of all objects created in the schema. When not-null the 'nspdefacl' would be the initial acl for the object (modified for what grants are valid for the specific type of object). These can only be changed by the schema owner and the 'OWNER role' must have create permissions in the schema. Ideally this would be checked when the ALTER SCHEMA is issued and then a dependency created for that. If that's not possible today then the rights check would be done when an object creation is attempted, possibly with a fall-back to check the current user's rights. The defaults would be NULL for these so there would be no change in behaviour unless specifically asked for. I believe this would cover the following to-do item: Allow GRANT/REVOKE permissions to be inherited by objects based on schema permissions Comments? Thanks, Stephen signature.asc Description: Digital signature
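As a concrete, purely hypothetical example of the proposed syntax (nothing here exists today, and the schema and role names are invented), a schema could be set up so that everything later created in it is owned by one role and readable by another:

    ALTER SCHEMA app WITH DEFAULT
        GRANT SELECT TO app_readers
        GRANT SELECT, INSERT, UPDATE, DELETE TO app_writers
        OWNER app_owner;

    -- Under the proposal, a table created afterwards in schema "app" would
    -- start out owned by app_owner with the above grants applied automatically.
    CREATE TABLE app.orders (id serial PRIMARY KEY, total numeric);

This is only one possible reading of the grammar above; how multiple options are separated is among the details open for comment.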
Re: [HACKERS] 10 weeks to feature freeze (Pending Work)
* Joshua D. Drake ([EMAIL PROTECTED]) wrote: > Thought I would do a poll of what is happening in the world for 8.3. I have: Another thing which was mentioned previously which I'd really like to see happen (and was discussed on the list...) is replacing the Kerberos support with GSSAPI support and adding support for SSPI. Don't recall who had said they were looking into working on it though.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] 10 weeks to feature freeze (Pending Work)
* Joshua D. Drake ([EMAIL PROTECTED]) wrote: > Thought I would do a poll of what is happening in the world for 8.3. I have: It seems unlikely that I'm going to have time at the rate things are going but I was hoping to take a whack at default permissions/ownership by schema. Kind of a umask-type thing but for schemas instead of roles (though I've thought about it per role and that might also solve the particular problem we're having atm). Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] 10 weeks to feature freeze (Pending Work)
"Joshua D. Drake" <[EMAIL PROTECTED]> wrote: > Thought I would do a poll of what is happening in the world for 8.3. I have: > > Alvaro Herrera: Autovacuum improvements (maintenance window etc..) > Gavin Sherry: Bitmap Indexes (on disk), possible basic Window functions > Jonah Harris: WITH/Recursive Queries? > Andrei Kovalesvki: Some Win32 work with Magnus > Magnus Hagander: VC++ support (thank goodness) > Heikki Linnakangas: Working on Vacuum for Bitmap Indexes? > Oleg Bartunov: Tsearch2 in core > Neil Conway: Patch Review (including enums), pg_fcache I'm working on Dead Space Map and Load-distribution of checkpoints. I will make it do by 8.3. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 10 weeks to feature freeze (Pending Work)
On Mon, 2007-01-22 at 14:16 -0800, Joshua D. Drake wrote: > I am sure there are more, the ones with question marks are unknowns but > heard of in the ether somewhere. Any additions or confirmations? > I'd still like to make an attempt at my Synchronized Scanning patch. If freeze is 10 weeks away, I better get some more test results posted soon, however. Regards, Jeff Davis ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] Would this SPI function be useful?
Would it be useful to have an SPI function that returns the OID and namespace of the function being executed? The reason I bring this up is due to a discussion on the PostGIS lists about making the installation able to work in any namespace from one generic SQL script. The problem they have is that, because the functions don't know what schema they reside in, they don't know how to call out to other functions in the same namespace. It might be useful to have a few basic functions that allow developers of add-ons to access information like that. What come to my mind are: SPI_get_my_oid(); SPI_get_my_nspname(); Regards, Jeff Davis ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
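Until something like that exists, the information can only be had by querying the catalogs with the function's own name hardcoded, which is exactly what a schema-relocatable install script wants to avoid. A sketch of that workaround, with a placeholder function name:

    -- Find the schema a known function lives in. The proposed SPI_get_my_oid()
    -- and SPI_get_my_nspname() would let a function discover this about itself
    -- at run time, without hardcoding 'my_addon_function'.
    SELECT n.nspname
    FROM pg_proc p
    JOIN pg_namespace n ON n.oid = p.pronamespace
    WHERE p.proname = 'my_addon_function';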
Re: [HACKERS] Strange file in snapshot tarball
Michael Meskes <[EMAIL PROTECTED]> wrote: > > There is a file 'compat_informix-dec_test-OpenBSD3.8.broken.stdout' > > under the path 'src/interfaces/ecpg/test/expected/' in the recent > > ftp snapshot (postgresql-snapshot.tar.gz). > > Would you please care to explain? I do not understand what you are > saying. Oops, sorry. It's my extractor's fault. The file is the only one in the tarball whose name is longer than 100 characters. My extractor does not support long file names. I'm sorry for all the fuss. --- ITAGAKI Takahiro NTT Open Source Software Center ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] -f option for pg_dumpall
elein wrote: > On Mon, Jan 15, 2007 at 10:13:16AM -0500, Andrew Dunstan wrote: >> >> >> Neil Conway wrote: >> >On Thu, 2007-01-11 at 14:36 -0500, Neil Conway wrote: >> > >> >>I don't think they need to be integrated any time soon, but if we were >> >>to design pg_dump and pg_dumpall from scratch, it seems more logical >> to >> >>use a single program >> >> >> > >> >On thinking about this some more, it might be useful to factor much of >> >pg_dump's logic for reconstructing the state of a database into a >> shared >> >library. This would make it relatively easy for developers to plug new >> >archive formats into the library (in addition to the present 3 archive >> >formats), or to make use of this functionality in other applications >> >that want to reconstruct the logical state of a database from the >> >content of the system catalogs. We could then provide a client app >> >implemented on top of the library that would provide similar >> >functionality to pg_dump. >> > >> >Moving pg_dump's functionality into the backend has been suggested in >> >the past (and rejected for good reason), but I think this might be a >> >more practical method for making the pg_dump logic more easily >> reusable. >> > >> > >> > >> >> I like this idea. For example, we might usefully map some of this to >> psql \ commands, without having to replicate the underlying logic. > > Don't we already do this with the .psqlrc file? > No. \ commands are implemented in C code. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] -f option for pg_dumpall
On Mon, Jan 15, 2007 at 10:13:16AM -0500, Andrew Dunstan wrote: > > > Neil Conway wrote: > >On Thu, 2007-01-11 at 14:36 -0500, Neil Conway wrote: > > > >>I don't think they need to be integrated any time soon, but if we were > >>to design pg_dump and pg_dumpall from scratch, it seems more logical to > >>use a single program > >> > > > >On thinking about this some more, it might be useful to factor much of > >pg_dump's logic for reconstructing the state of a database into a shared > >library. This would make it relatively easy for developers to plug new > >archive formats into the library (in addition to the present 3 archive > >formats), or to make use of this functionality in other applications > >that want to reconstruct the logical state of a database from the > >content of the system catalogs. We could then provide a client app > >implemented on top of the library that would provide similar > >functionality to pg_dump. > > > >Moving pg_dump's functionality into the backend has been suggested in > >the past (and rejected for good reason), but I think this might be a > >more practical method for making the pg_dump logic more easily reusable. > > > > > > > > I like this idea. For example, we might usefully map some of this to > psql \ commands, without having to replicate the underlying logic. Don't we already do this with the .psqlrc file? --elein ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 10 weeks to feature freeze (Pending Work)
> Thought I would do a poll of what is happening in the world for 8.3. I have: > > Alvaro Herrera: Autovacuum improvements (maintenance window etc..) > Gavin Sherry: Bitmap Indexes (on disk), possible basic Window functions > Jonah Harris: WITH/Recursive Queries? > Andrei Kovalesvki: Some Win32 work with Magnus > Magnus Hagander: VC++ support (thank goodness) > Heikki Linnakangas: Working on Vacuum for Bitmap Indexes? > Oleg Bartunov: Tsearch2 in core > Neil Conway: Patch Review (including enums), pg_fcache Korry Douglas: PL/pgSQL debugger (and probably a PL/pgSQL execution profiler as well) -- Korry Douglas[EMAIL PROTECTED] EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [GENERAL] Autovacuum Improvements
On Jan 22, 2007, at 11:16 AM, Richard Huxton wrote: Bruce Momjian wrote: Yep, agreed on the random I/O issue. The larger question is if you have a huge table, do you care to reclaim 3% of the table size, rather than just vacuum it when it gets to 10% dirty? I realize the vacuum is going to take a lot of time, but vacuuming to relaim 3% three times seems like it is going to be more expensive than just vacuuming the 10% once. And vacuuming to reclaim 1% ten times seems even more expensive. The partial vacuum idea is starting to look like a loser to me again. Buying a house with a 25-year mortgage is much more expensive than just paying cash too, but you don't always have a choice. Surely the key benefit of the partial vacuuming thing is that you can at least do something useful with a large table if a full vacuum takes 24 hours and you only have 4 hours of idle I/O. It's also occurred to me that all the discussion of scheduling way back when isn't directly addressing the issue. What most people want (I'm guessing) is to vacuum *when the user-workload allows* and the time-tabling is just a sysadmin first-approximation at that. Yup. I'd really like for my app to be able to say "Hmm. No interactive users at the moment, no critical background tasks. Now would be a really good time for the DB to do some maintenance." but also to be able to interrupt the maintenance process if some new users or other system load show up. With partial vacuuming possible, we can arrange things with just three thresholds and two measurements: Measurement 1 = system workload Measurement 2 = a per-table "requires vacuuming" value Threshold 1 = workload at which we do more vacuuming Threshold 2 = workload at which we do less vacuuming Threshold 3 = point at which a table is considered worth vacuuming. Once every 10 seconds, the manager compares the current workload to the thresholds and starts a new vacuum, kills one or does nothing. New vacuum processes keep getting started as long as there is workload spare and tables that need vacuuming. Now the trick of course is how you measure system workload in a meaningful manner. I'd settle for a "start maintenance", "stop maintenance" API. Anything else (for instance the heuristics you suggest above) would definitely be gravy. It's not going to be simple to do, though, I don't think. Cheers, Steve ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
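The per-table "requires vacuuming" measurement sketched above could, for example, be approximated from the statistics collector, assuming it exposes per-table dead-tuple counts (as pg_stat_user_tables.n_dead_tup does in later releases). A hypothetical sketch:

    -- Measurement 2: fraction of each table that is dead space. A manager
    -- process could vacuum any table whose score exceeds Threshold 3, for as
    -- long as the workload measurement stays on the right side of Threshold 1.
    SELECT schemaname, relname,
           n_dead_tup::float8 / GREATEST(n_live_tup + n_dead_tup, 1) AS dead_fraction
    FROM pg_stat_user_tables
    ORDER BY dead_fraction DESC;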
Re: [HACKERS] autovacuum process handling
On Mon, Jan 22, 2007 at 04:24:28PM -0300, Alvaro Herrera wrote: > 4. Launcher will be a continuously-running process, akin to bgwriter; > connected to shared memory So would it use up a database connection? > 5. Workers will be direct postmaster children; so postmaster will get > SIGCHLD when worker dies As part of this I think we need to make it more obvious how all of this ties into max_connections. Currently, autovac ties up one of the super-user connections whenever it's not asleep; these changes would presumably mean that more of those connections could be tied up. Rather than forcing users to worry about adjusting max_connections and superuser_reserved_connections to accommodate autovacuum, the system should handle it for them. Were you planning on limiting the number of concurrent vacuum processes that could be running? If so, we could probably just increase superuser connections by that amount. If not, we might need to think of something else... -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [pgsql-patches] [HACKERS] Win32 WEXITSTATUS too
bruce wrote: > Tom Lane wrote: > > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > > Bruce Momjian wrote: > > >> OK, maybe /doc or src/tools. A more radical approach would be to put > > >> the list in our documentation, or have initdb install it. > > > > > Why not put it in techdocs or some such? > > > > I think we've learned by now that putting copies of other peoples' code > > in our tree isn't such a hot idea; what is going to cause it to be > > updated when things change? How do you know the values are even the > > same across all the Windows versions we support? > > > > Basically this whole idea is misconceived. Just print the number and > > have done. > > And how do people interpret that number? Ah, I found something: http://support.microsoft.com/kb/259693 Someone on IRC says that is kernel mode only, and is looking for a user-mode version, so we would be able to print out a meaningful message rather than a hex value that has to be looked up. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] pg_dump ANALYZE statements
On Mon, Jan 22, 2007 at 03:40:17PM +, Simon Riggs wrote: > This would add a table-specific ANALYZE statement following each table's > actions. It'd probably be best to put it before any index creating activities, since there's a better chance of everything from the table being in shared buffers. Better yet would be if COPY could analyze data as it was loaded in... -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
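For illustration, a sketch of how the tail of one table's section in a dump might look under this proposal; the table, column, and index names are made up, and this is not actual pg_dump output. Following Tom Lane's reply (earlier in this digest), the ANALYZE is placed after the index builds so that statistics for the expression index are gathered as well:

    -- ... table data restored above via COPY ...
    CREATE INDEX orders_lower_email_idx ON orders (lower(customer_email));
    ALTER TABLE ONLY orders ADD CONSTRAINT orders_pkey PRIMARY KEY (id);

    -- Proposed per-table statement, emitted once the table's indexes exist:
    ANALYZE orders;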
Re: [HACKERS] [GENERAL] Autovacuum Improvements
On Mon, Jan 22, 2007 at 12:17:39PM -0800, Ron Mayer wrote: > Gregory Stark wrote: > > > > Actually no. A while back I did experiments to see how fast reading a file > > sequentially was compared to reading the same file sequentially but skipping > > x% of the blocks randomly. The results were surprising (to me) and > > depressing. > > The breakeven point was about 7%. [...] > > > > The theory online was that as long as you're reading one page from each disk > > track you're going to pay the same seek overhead as reading the entire > > track. > > Could one take advantage of this observation in designing the DSM? > > Instead of a separate bit representing every page, having each bit > represent 20 or so pages might be a more useful unit. It sounds > like the time spent reading would be similar; while the bitmap > would be significantly smaller. If we extended relations by more than one page at a time we'd probably have a better shot at the blocks on disk being contiguous and all read at the same time by the OS. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Piggybacking vacuum I/O
On Mon, Jan 22, 2007 at 02:51:47PM +, Heikki Linnakangas wrote: > I've been looking at the way we do vacuums. > > The fundamental performance issue is that a vacuum generates > nheapblocks+nindexblocks+ndirtyblocks I/Os. Vacuum cost delay helps to > spread the cost like part payment, but the total is the same. In an I/O > bound system, the extra I/O directly leads to less throughput. > > Therefore, we need to do less I/O. Dead space map helps by allowing us > to skip blocks that don't need vacuuming, reducing the # of I/Os to > 2*ndirtyblocks+nindexblocks. That's great, but it doesn't help us if the > dead tuples are spread uniformly. > > If we could piggyback the vacuum I/Os to the I/Os that we're doing > anyway, vacuum wouldn't ideally have to issue any I/O of its own. I've > tried to figure out a way to do that. > > Vacuum is done in 3 phases: > > 1. Scan heap > 2. Vacuum index > 3. Vacuum heap > Instead of doing a sequential scan, we could perform the 1st phase by > watching the buffer pool, scanning blocks for dead tuples when they're > in memory and keeping track of which pages we've seen. When all pages > have been seen, the tid list is sorted and 1st phase is done. > > In theory, the index vacuum could also be done that way, but let's > assume for now that indexes would be scanned like they are currently. > > The 3rd phase can be performed similarly to the 1st phase. Whenever a > page enters the buffer pool, we check the tid list and remove any > matching tuples from the page. When the list is empty, vacuum is complete. Is there any real reason to demark the start and end of a vacuum? Why not just go to a continuous process? One possibility is to keep a list of TIDs for each phase, though that could prove tricky with multiple indexes. > A variation of the scheme would be to keep scanning pages that are in > cache, until the tid list reaches a predefined size, instead of keeping > track of which pages have already been seen. That would deal better with > tables with hot and cold spots, but it couldn't advance the relfrozenid > because there would be no guarantee that all pages are visited. Also, we > could start 1st phase of the next vacuum, while we're still in the 3rd > phase of previous one. What if we tracked freeze status on a per-page basis? Perhaps track the minimum XID that's on each page. That would allow us to ensure that we freeze pages that are approaching XID wrap. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] send password to pg_dump
shaunc wrote:
> Hello, I'm trying to run pg_dump programmatically, and I'm wondering how I can send it a password. I'm running it with a system() call in a child process, and sending the password in from the parent process on stdin, but somehow pg_dump always finds out how to ask my terminal for a password. How do I fool it? (I need this to work in linux... in python would be preferable too.)

1. this is the wrong forum for this question - here we discuss postgresql developments, not usage
2. use a pgpass file, possibly with a PGPASSFILE environment setting as well - see the docs for details. This works on all platforms and should work with anything at all calling pg_dump.
3. In general, the only way to "fool" programs properly this way is to use expect or some equivalent that uses ptys. In this case, it would be overkill.

cheers andrew

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
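A sketch of the pgpass approach Andrew describes; the host, database, user, and password below are placeholders, and the file must not be group- or world-readable or libpq will ignore it:

    # contents of ~/.pgpass (or of any file named by the PGPASSFILE
    # environment variable), one line per connection:
    # hostname:port:database:username:password
    db.example.com:5432:mydb:dumpuser:s3cret

With that in place, pg_dump run non-interactively for the matching host, database, and user no longer prompts for a password.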
Re: [HACKERS] 10 weeks to feature freeze (Pending Work)
On Mon, 22 Jan 2007, Joshua D. Drake wrote: Or so... :) Thought I would do a poll of what is happening in the world for 8.3. I have: Alvaro Herrera: Autovacuum improvements (maintenance window etc..) Gavin Sherry: Bitmap Indexes (on disk), possible basic Window functions Jonah Harris: WITH/Recursive Queries? Andrei Kovalesvki: Some Win32 work with Magnus Magnus Hagander: VC++ support (thank goodness) Heikki Linnakangas: Working on Vacuum for Bitmap Indexes? Oleg Bartunov: Tsearch2 in core Teodor Sigaev should be here ! Neil Conway: Patch Review (including enums), pg_fcache Vertical projects: Pavel Stehule: PLpsm Alexey Klyukin: PLphp Andrei Kovalesvki: ODBCng I am sure there are more, the ones with question marks are unknowns but heard of in the ether somewhere. Any additions or confirmations? Sincerely, Joshua D. Drake Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] send password to pg_dump
Hello, I'm trying to run pg_dump programmatically, and I'm wondering how I can send it a password. I'm running it with a system() call in a child process, and sending the password in from the parent process on stdin, but somehow pg_dump always finds out how to ask my terminal for a password. How do I fool it? (I need this to work in linux... in python would be preferable too.) Thanks for any advice. - Shaun -- View this message in context: http://www.nabble.com/send-password-to-pg_dump-tf3048210.html#a8473174 Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [BUGS] BUG #2907: pg_get_serial_sequence quoting
Tom Lane wrote: Bruce Momjian <[EMAIL PROTECTED]> writes: I presume the reason for that is that the first paramater can be qualified: select pg_get_serial_sequence('"public"."FOO"', 'Ff1'); Would someone explain why qualification makes us lowercase the first parameter by default? I don't understand it well enough to document it. The point is that we have to parse the first parameter, whereas the second one can be taken literally. It still looks inconsistent and ugly. I think the design mistake of pg_get_serial_sequence is that it takes two parameters rather than one (a fully qualified doublequoted columnname path) or three (optionally empty schema, tablename, columnname, all three literal). Regards, Adriaan van Os ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
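The asymmetry under discussion, illustrated with a mixed-case table and column (the names are only for illustration):

    CREATE TABLE public."FOO" ("Ff1" serial);

    -- The first argument is parsed as SQL, so mixed case must be double-quoted:
    SELECT pg_get_serial_sequence('"public"."FOO"', 'Ff1');    -- works
    SELECT pg_get_serial_sequence('public.FOO', 'Ff1');        -- fails: FOO is folded to "foo"

    -- The second argument is taken literally, so adding quotes there breaks it:
    SELECT pg_get_serial_sequence('"public"."FOO"', '"Ff1"');  -- fails: the quotes become part of the column name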
Re: [HACKERS] [GENERAL] Autovacuum Improvements
On Mon, Jan 22, 2007 at 07:24:20PM +, Heikki Linnakangas wrote: > Kenneth Marshall wrote: > >On Mon, Jan 22, 2007 at 06:42:09PM +, Simon Riggs wrote: > >>Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread... > > > >There may be other functions that could leverage a similar sort of > >infrastructure. For example, a long DB mining query could be registered > >with the system. Then as the pieces of the table/database are brought in > >to shared memory during the normal daily DB activity they can be acquired > >without forcing the DB to run a very I/O expensive query when waiting a > >bit for the results would be acceptable. As long as we are thinking > >piggyback. > > Yeah, I had the same idea when we discussed synchronizing sequential > scans. The biggest difference is that with queries, there's often a user > waiting for the query to finish, but with vacuum we don't care so much > how long it takes. > Yes, but with trending and statistical analysis you may not need the exact answer ASAP. An approximate answer based on a fraction of the information would be useful. Also, "what if" queries could be run without impacting the production uses of a database. One might imagine having a query with results that "converge" as the table is processed during normal use. Ken ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Piggybacking vacuum I/O
On Mon, 2007-01-22 at 13:41 +, Heikki Linnakangas wrote: > Any thoughts before I start experimenting? Probably only to detail the various use cases we are discussing. My thoughts on various use cases are: - small table with frequent update/delete, heap and indexes all/mostly cached e.g. Counter tables, DBT2: District/Warehouse TPC-C, pgbench: Branches/Tellers Current VACUUM works well in this situation, since the only I/O incurred is the WAL written for the VACUUM. VACUUM very cheap even if not in cache because of sequential I/O. Keeping track of whether there are hot spots in these tables seems like a waste of cycles and could potentially introduce contention and hence reduce performance. These need to be very frequently VACUUMed, even when other VACUUMs are required. My current view: just need multiple concurrent autovacuum processes. - large table with severe hotspots e.g. DBT2: NewOrder, larger queue-style tables The hotspots are likely to be in cache and the not-so-hotspots might or might not be in cache, but we don't care either way. DSM concept works well for this case, since we are able to avoid lots of I/O by appropriate book-keeping. Works well for removing rows after a file-scan DELETE, as well as for DELETE or UPDATE hot spots. My current view: DSM would be great for this - large table with few hotspots e.g. DBT2: Stock, pgbench: Accounts, most Customer tables Current VACUUM works very badly in this case, since updates are sparsely distributed across table. DSM wouldn't help either unless we differentiate between few/many updates to a block. My current view: Piggyback concept seems on the right track, but clearly needs further thought. Currently we have only one technique for garbage collection, plus one process to perform it. We need multiple techniques executed by multiple processes, when required, plus some way of automatically selecting which is appropriate depending upon the use case. Yes, automatic :-) DSM and this piggyback idea need not be thought of as competing techniques. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [GENERAL] Autovacuum Improvements
On Mon, Jan 22, 2007 at 06:42:09PM +, Simon Riggs wrote: > On Mon, 2007-01-22 at 13:27 -0500, Bruce Momjian wrote: > > Yep, agreed on the random I/O issue. The larger question is if you have > > a huge table, do you care to reclaim 3% of the table size, rather than > > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > > to take a lot of time, but vacuuming to relaim 3% three times seems like > > it is going to be more expensive than just vacuuming the 10% once. And > > vacuuming to reclaim 1% ten times seems even more expensive. The > > partial vacuum idea is starting to look like a loser to me again. > > Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread... > > -- > Simon Riggs > EnterpriseDB http://www.enterprisedb.com > There may be other functions that could leverage a similar sort of infrastructure. For example, a long DB mining query could be registered with the system. Then as the pieces of the table/database are brought in to shared memory during the normal daily DB activity they can be acquired without forcing the DB to run a very I/O expensive query when waiting a bit for the results would be acceptable. As long as we are thinking piggyback. Ken ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 10 weeks to feature freeze (Pending Work)
On 1/22/07, Joshua D. Drake <[EMAIL PROTECTED]> wrote: Jonah Harris: WITH/Recursive Queries? Yup, just talked with Bruce about this last week. Working on the design now. -- Jonah H. Harris, Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 33 Wood Ave S, 3rd Floor| [EMAIL PROTECTED] Iselin, New Jersey 08830| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] 10 weeks to feature freeze (Pending Work)
Or so... :) Thought I would do a poll of what is happening in the world for 8.3. I have: Alvaro Herrera: Autovacuum improvements (maintenance window etc..) Gavin Sherry: Bitmap Indexes (on disk), possible basic Window functions Jonah Harris: WITH/Recursive Queries? Andrei Kovalesvki: Some Win32 work with Magnus Magnus Hagander: VC++ support (thank goodness) Heikki Linnakangas: Working on Vacuum for Bitmap Indexes? Oleg Bartunov: Tsearch2 in core Neil Conway: Patch Review (including enums), pg_fcache Vertical projects: Pavel Stehule: PLpsm Alexey Klyukin: PLphp Andrei Kovalesvki: ODBCng I am sure there are more, the ones with question marks are unknowns but heard of in the ether somewhere. Any additions or confirmations? Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] autovacuum process handling
Alvaro Herrera wrote:
> This is how I think autovacuum should change with an eye towards being
> able to run multiple vacuums simultaneously:
> [snip details]
> Does this raise some red flags? It seems straightforward enough to me;
> I'll submit a patch implementing this, so that scheduling will continue
> to be as it is today. Thus the scheduling discussions are being deferred
> until they can be actually useful and implementable.

I can't really speak to the PostgreSQL signaling innards, but this sounds logical to me. I think having the worker processes be children of the postmaster and having them be single-minded (or single-tasked) also makes a lot of sense.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [GENERAL] Autovacuum Improvements
>-Original Message-
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Gregory Stark
>Sent: maandag 22 januari 2007 19:41
>To: Bruce Momjian
>Cc: Heikki Linnakangas; Russell Smith; Darcy Buskermolen; Simon Riggs;
>Alvaro Herrera; Matthew T. O'Connor; Pavan Deolasee; Christopher Browne;
>pgsql-general@postgresql.org; pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] [GENERAL] Autovacuum Improvements
>
>"Bruce Momjian" <[EMAIL PROTECTED]> writes:
>
>> Yep, agreed on the random I/O issue. The larger question is if you
>> have a huge table, do you care to reclaim 3% of the table size, rather
>> than just vacuum it when it gets to 10% dirty? I realize the vacuum
>> is going to take a lot of time, but vacuuming to relaim 3% three times
>> seems like it is going to be more expensive than just vacuuming the
>> 10% once. And vacuuming to reclaim 1% ten times seems even more
>> expensive. The partial vacuum idea is starting to look like a loser to me again.
>
>Well the answer is of course "that depends".
>
>If you maintain the dead space at a steady state averaging 1.5% instead
>of 5% your table is 3.33% smaller on average. If this is a DSS system
>that will translate into running your queries 3.33% faster. It will
>take a lot of vacuums before they hurt more than a 3%+ performance drop.

Good, this means a DSS system will mostly do table scans (right?). So probably you should witness the 'table scan' statistic and rows fetched approaching the end of the universe (at least compared to inserts/updates/deletes)?

>If it's an OLTP system the it's harder to figure. a 3.33% increase in
>data density will translate to a higher cache hit rate but how much
>higher depends on a lot of factors. In our experiments we actually got
>bigger boost in these kinds of situations than the I expected (I expected
>comparable to the 3.33% improvement). So it could be even more than 3.33%.
>But like said it depends.
>If you already have the whole database cache you won't see any
>improvement. If you are right on the cusp you could see a huge benefit.

These tables have high insert, update and delete rates, probably a lot of index scans? I believe the workload on table scans should be (close to) none. Are you willing to share some of this measured data? I'm quite interested in such figures.

>It sounds like you're underestimating the performance drain 10% wasted
>space has. If we found out that one routine was unnecessarily taking 10%
>of the cpu time it would be an obvious focus of attention. 10% wasted
>space is going to work out to about 10% of the i/o time.
>
>It also sounds like we're still focused on the performance impact in
>absolute terms. I'm much more interested in changing the performance
>characteristics so they're predictable and scalable. It doesn't matter
>much if your 1kb table is 100% slower than necessary but it does matter
>if your 1TB table needs 1,000x as much vacuuming as your 1GB table even
>if it's getting the same update traffic.

Or rather, the vacuuming should pay back. A nice metric might be: cost_of_not_vacuuming / cost_of_vacuuming. Obviously, the higher the better.

- Joris Dobbelsteen

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

       http://www.postgresql.org/docs/faq
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Gregory Stark wrote: > > Actually no. A while back I did experiments to see how fast reading a file > sequentially was compared to reading the same file sequentially but skipping > x% of the blocks randomly. The results were surprising (to me) and depressing. > The breakeven point was about 7%. [...] > > The theory online was that as long as you're reading one page from each disk > track you're going to pay the same seek overhead as reading the entire track. Could one take advantage of this observation in designing the DSM? Instead of a separate bit representing every page, having each bit represent 20 or so pages might be a more useful unit. It sounds like the time spent reading would be similar; while the bitmap would be significantly smaller. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
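As a rough worked example of the space this coarser granularity would save (the 1 TB table size and 8 kB page size are assumed purely for illustration), the two bitmap sizes compare like this:

SELECT pg_size_pretty(1099511627776 / 8192 / 8)      AS bitmap_one_bit_per_page,
       pg_size_pretty(1099511627776 / 8192 / 8 / 20) AS bitmap_one_bit_per_20_pages;
-- roughly 16 MB versus about 800 kB for a 1 TB table

Either map is small relative to the table itself; the coarser unit mainly buys a smaller structure to keep resident and maintain.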
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Alvaro Herrera wrote: > Bruce Momjian wrote: > > > > Yep, agreed on the random I/O issue. The larger question is if you have > > a huge table, do you care to reclaim 3% of the table size, rather than > > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > > to take a lot of time, but vacuuming to relaim 3% three times seems like > > it is going to be more expensive than just vacuuming the 10% once. And > > vacuuming to reclaim 1% ten times seems even more expensive. The > > partial vacuum idea is starting to look like a loser to me again. > > But if the partial vacuum is able to clean the busiest pages and reclaim > useful space, currently-running transactions will be able to use that > space and thus not have to extend the table. Not that extension is a > problem on itself, but it'll keep your working set smaller. Yes, but my point is that if you are trying to avoid vacuuming the table, I am afraid the full index scan is going to be painful too. I can see corner cases where partial vacuum is a win (I only have 4 hours of idle I/O), but for the general case I am still worried that partial vacuum will not be that useful as long as we have to scan the indexes. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Kenneth Marshall wrote: On Mon, Jan 22, 2007 at 06:42:09PM +, Simon Riggs wrote: Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread... There may be other functions that could leverage a similar sort of infrastructure. For example, a long DB mining query could be registered with the system. Then as the pieces of the table/database are brought in to shared memory during the normal daily DB activity they can be acquired without forcing the DB to run a very I/O expensive query when waiting a bit for the results would be acceptable. As long as we are thinking piggyback. Yeah, I had the same idea when we discussed synchronizing sequential scans. The biggest difference is that with queries, there's often a user waiting for the query to finish, but with vacuum we don't care so much how long it takes. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] autovacuum process handling
Hi,

This is how I think autovacuum should change with an eye towards being able to run multiple vacuums simultaneously:

1. There will be two kinds of processes, "autovacuum launcher" and "autovacuum worker".

2. The launcher will be in charge of scheduling and will tell workers what to do

3. The workers will be similar to what autovacuum does today: start when somebody else tells it to start, process a single item (be it a table or a database) and terminate

4. Launcher will be a continuously-running process, akin to bgwriter; connected to shared memory

5. Workers will be direct postmaster children; so postmaster will get SIGCHLD when worker dies

6. Launcher will start a worker using the following protocol:
   - Set up information on what to run on shared memory
   - invoke SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER)
   - Postmaster will react by starting a worker, and registering it very similarly to a regular backend, so it can be shut down easily when appropriate. (Thus launcher will not be informed right away when worker dies)
   - Worker will examine shared memory to know what to do, clear the request, and send a signal to Launcher
   - Launcher wakes up and can start another one if appropriate

Does this raise some red flags? It seems straightforward enough to me; I'll submit a patch implementing this, so that scheduling will continue to be as it is today. Thus the scheduling discussions are being deferred until they can be actually useful and implementable.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

       http://www.postgresql.org/docs/faq
Re: [HACKERS] savepoint improvements
On 1/22/07, Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> > i think so...Martijn said it best: you can 'rollback' to, but you
> > can't 'commit' to. The 'commit to' would be the arguably much more
> > useful way of disposing of a savepoint. But that should be taken up
> > with sql standards committee :(.
>
> You can RELEASE a savepoint though.

not following an error. RELEASE serves absolutely no purpose whatsoever. it's like the sql equivalent of an assembly NOP...wastes cpu cycles for no reason.

merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Gregory Stark wrote:
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
> > I agree it index cleanup isn't > 50% of vacuum. I was trying to figure
> > out how small, and it seems about 15% of the total table, which means
> > if we have bitmap vacuum, we can conceivably reduce vacuum load by
> > perhaps 80%, assuming 5% of the table is scanned.
>
> Actually no. A while back I did experiments to see how fast reading a file
> sequentially was compared to reading the same file sequentially but skipping
> x% of the blocks randomly. The results were surprising (to me) and depressing.
> The breakeven point was about 7%.

Note that with uniformly random updates, you will have dirtied every page of the table long before you get anywhere near 5% dead space. So we have to assume a non-uniform distribution of updates for the DSM to be of any help. And if we assume a non-uniform distribution, it's a good bet that the blocks that need vacuuming are also not randomly distributed. In fact, they might very well all be in one cluster, so that scanning that cluster is indeed sequential I/O.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Bruce Momjian wrote:
> Yep, agreed on the random I/O issue. The larger question is if you have
> a huge table, do you care to reclaim 3% of the table size, rather than
> just vacuum it when it gets to 10% dirty? I realize the vacuum is going
> to take a lot of time, but vacuuming to relaim 3% three times seems like
> it is going to be more expensive than just vacuuming the 10% once. And
> vacuuming to reclaim 1% ten times seems even more expensive. The
> partial vacuum idea is starting to look like a loser to me again.

Buying a house with a 25-year mortgage is much more expensive than just paying cash too, but you don't always have a choice. Surely the key benefit of the partial vacuuming thing is that you can at least do something useful with a large table if a full vacuum takes 24 hours and you only have 4 hours of idle I/O.

It's also occurred to me that all the discussion of scheduling way back when isn't directly addressing the issue. What most people want (I'm guessing) is to vacuum *when the user-workload allows* and the time-tabling is just a sysadmin first-approximation at that.

With partial vacuuming possible, we can arrange things with just three thresholds and two measurements:

Measurement 1 = system workload
Measurement 2 = a per-table "requires vacuuming" value
Threshold 1 = workload at which we do more vacuuming
Threshold 2 = workload at which we do less vacuuming
Threshold 3 = point at which a table is considered worth vacuuming.

Once every 10 seconds, the manager compares the current workload to the thresholds and starts a new vacuum, kills one or does nothing. New vacuum processes keep getting started as long as there is workload spare and tables that need vacuuming.

Now the trick of course is how you measure system workload in a meaningful manner.

-- 
Richard Huxton
Archonet Ltd

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
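Measurement 2 above could start out very crude. The following is only an illustrative sketch of one way to approximate a per-table "requires vacuuming" value from the standard statistics views; the weighting and the LIMIT are arbitrary, and a real implementation would presumably live inside the autovacuum manager rather than in SQL:

SELECT s.relname,
       s.n_tup_upd + s.n_tup_del AS changes_since_stats_reset,
       round((s.n_tup_upd + s.n_tup_del) / greatest(c.reltuples, 1)::numeric, 3)
           AS crude_requires_vacuuming_score
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
ORDER BY crude_requires_vacuuming_score DESC
LIMIT 10;

The "system workload" measurement is the genuinely hard part, as the message above says.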
Re: [HACKERS] savepoint improvements
Merlin Moncure wrote: > On 1/22/07, Dawid Kuroczko <[EMAIL PROTECTED]> wrote: > >one for each INSERT+UPDATE block. This way eiher both of them succeed > >or fail, within one transaction. > > i think so...Martijn said it best: you can 'rollback' to, but you > can't 'commit' to. The 'commit to' would be the arguably much more > useful way of disposing of a savepoint. But that should be taken up > with sql standards committee :(. You can RELEASE a savepoint though. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Bruce Momjian wrote:
> Yep, agreed on the random I/O issue. The larger question is if you have
> a huge table, do you care to reclaim 3% of the table size, rather than
> just vacuum it when it gets to 10% dirty? I realize the vacuum is going
> to take a lot of time, but vacuuming to relaim 3% three times seems like
> it is going to be more expensive than just vacuuming the 10% once. And
> vacuuming to reclaim 1% ten times seems even more expensive. The
> partial vacuum idea is starting to look like a loser to me again.

But if the partial vacuum is able to clean the busiest pages and reclaim useful space, currently-running transactions will be able to use that space and thus not have to extend the table. Not that extension is a problem in itself, but it'll keep your working set smaller.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

       http://www.postgresql.org/about/donate
Re: [HACKERS] savepoint improvements
On 1/22/07, Dawid Kuroczko <[EMAIL PROTECTED]> wrote: On 1/22/07, Tom Lane <[EMAIL PROTECTED]> wrote: > > The short version is I would like the ability to run some sql commands > > and recover the transaction if an error occurs. > > I'm getting tired of repeating this, but: neither of you have said > anything that doesn't appear to me to be handled by ON_ERROR_ROLLBACK. > What exactly is lacking in that feature? I think the problem is with doing something like this: BEGIN; INSERT INTO foo VALUES ('1'); UPDATE status SET updated=now() WHERE tab='foo'; INSERT INTO bar VALUES ('2'); UPDATE status SET updated=now() WHERE tab='bar'; INSERT INTO baz VALUES ('3'); UPDATE status SET updated=now() WHERE tab='baz'; COMMIT; This will issue three savepoints (if I understand how things wok correctly), yes one for each INSERT+UPDATE block. This way eiher both of them succeed or fail, within one transaction. i think so...Martijn said it best: you can 'rollback' to, but you can't 'commit' to. The 'commit to' would be the arguably much more useful way of disposing of a savepoint. But that should be taken up with sql standards committee :(. One solution would be a psql command which would fire given command on error condition, like: yes, psql can handle this. while (IMO) a hack, it addresses the shortcoming (scripting) specifically not handled by savepoints.. merlin ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] savepoint improvements
On 1/22/07, Martijn van Oosterhout wrote:
> I don't understand this either. Everything you can do with nested
> transactions you can also do with savepoints, so I'm really not
> understand what the limitations are?
>
> Actually, looking at the savepoint documentation, it looks like there
> is no way to say:
>
> if transaction_state ok then release X else rollback to X

exactly.

> Which is what a normal COMMIT does (sort of). This is very irritating
> for scripting, so maybe a "COMMIT TO X" command would be auseful
> addition?

right. that's exactly what I want (more or less, there are a couple of different ways to do it, but this is perfectly acceptable). The on errors bit was just a froofy addition that distracted from the core problem.

merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [HACKERS] [GENERAL] Autovacuum Improvements
On Mon, 2007-01-22 at 13:27 -0500, Bruce Momjian wrote: > Yep, agreed on the random I/O issue. The larger question is if you have > a huge table, do you care to reclaim 3% of the table size, rather than > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > to take a lot of time, but vacuuming to relaim 3% three times seems like > it is going to be more expensive than just vacuuming the 10% once. And > vacuuming to reclaim 1% ten times seems even more expensive. The > partial vacuum idea is starting to look like a loser to me again. Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread... -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [GENERAL] Autovacuum Improvements
"Bruce Momjian" <[EMAIL PROTECTED]> writes: > Yep, agreed on the random I/O issue. The larger question is if you have > a huge table, do you care to reclaim 3% of the table size, rather than > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > to take a lot of time, but vacuuming to relaim 3% three times seems like > it is going to be more expensive than just vacuuming the 10% once. And > vacuuming to reclaim 1% ten times seems even more expensive. The > partial vacuum idea is starting to look like a loser to me again. Well the answer is of course "that depends". If you maintain the dead space at a steady state averaging 1.5% instead of 5% your table is 3.33% smaller on average. If this is a DSS system that will translate into running your queries 3.33% faster. It will take a lot of vacuums before they hurt more than a 3%+ performance drop. If it's an OLTP system the it's harder to figure. a 3.33% increase in data density will translate to a higher cache hit rate but how much higher depends on a lot of factors. In our experiments we actually got bigger boost in these kinds of situations than the I expected (I expected comparable to the 3.33% improvement). So it could be even more than 3.33%. But like said it depends. If you already have the whole database cache you won't see any improvement. If you are right on the cusp you could see a huge benefit. It sounds like you're underestimating the performance drain 10% wasted space has. If we found out that one routine was unnecessarily taking 10% of the cpu time it would be an obvious focus of attention. 10% wasted space is going to work out to about 10% of the i/o time. It also sounds like we're still focused on the performance impact in absolute terms. I'm much more interested in changing the performance characteristics so they're predictable and scalable. It doesn't matter much if your 1kb table is 100% slower than necessary but it does matter if your 1TB table needs 1,000x as much vacuuming as your 1GB table even if it's getting the same update traffic. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] savepoint improvements
On 1/22/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> > The short version is I would like the ability to run some sql commands
> > and recover the transaction if an error occurs.
>
> I'm getting tired of repeating this, but: neither of you have said
> anything that doesn't appear to me to be handled by ON_ERROR_ROLLBACK.
> What exactly is lacking in that feature?

I think the problem is with doing something like this:

BEGIN;
INSERT INTO foo VALUES ('1');
UPDATE status SET updated=now() WHERE tab='foo';
INSERT INTO bar VALUES ('2');
UPDATE status SET updated=now() WHERE tab='bar';
INSERT INTO baz VALUES ('3');
UPDATE status SET updated=now() WHERE tab='baz';
COMMIT;

This will issue three savepoints (if I understand how things work correctly), one for each INSERT+UPDATE block. This way either both of them succeed or fail, within one transaction.

Now, I think the problem the OP wanted to solve was having to keep commands on one line just to have them "inside" one savepoint, and depending on psql(1) to issue rollbacks for us. I think the OP's idea was to be able to rollback if an error occurred:

BEGIN;
SAVEPOINT s1;
INSERT...
UPDATE...
ROLLBACK TO s1 ON ERROR;
SAVEPOINT s2;
INSERT..
UPDATE...
ROLLBACK TO s2 ON ERROR;
UPDATE job SET ts = now(); -- OK
COMMIT; -- notice lack of rollback
        -- whole transaction will fail on error

One solution would be a psql command which would fire a given command on an error condition, like:

BEGIN;
SAVEPOINT s1;
INSERT...
UPDATE...
\on_error ROLLBACK TO s1; INSERT INTO errors
SAVEPOINT s2;
COMMIT;

Regards,
Dawid

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [HACKERS] savepoint improvements
On Mon, Jan 22, 2007 at 11:21:12AM -0500, Merlin Moncure wrote:
> >\begin_nest
> >\commit_nest
> >\rollback_nest
>
> That would work if we could rollback conditionally on failure (like
> on_error_rollback but with definable beginning and ending points). I

Sorry, "rollback conditionally on failure" isn't parsing for me. Can you give some example of what you mean?

> still think we are hacking around limitations of savepoints but it
> would solve the scripting problem at least. A general implementation
> on the server would benefit everybody.

I don't understand this either. Everything you can do with nested transactions you can also do with savepoints, so I'm really not understanding what the limitations are?

Actually, looking at the savepoint documentation, it looks like there is no way to say:

if transaction_state ok then
    release X
else
    rollback to X

Which is what a normal COMMIT does (sort of). This is very irritating for scripting, so maybe a "COMMIT TO X" command would be a useful addition?

Have a nice day,
-- 
Martijn van Oosterhout   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.

signature.asc
Description: Digital signature
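For what it's worth, the release-on-success / rollback-on-error behaviour described above already exists server-side in PL/pgSQL, where a BEGIN ... EXCEPTION block wraps its body in an implicit savepoint. A minimal sketch follows; the function and table names are invented for illustration, and it is of course not a substitute for the proposed COMMIT TO in plain SQL scripts:

CREATE OR REPLACE FUNCTION insert_with_fallback() RETURNS void AS $$
BEGIN
    BEGIN
        INSERT INTO foo VALUES ('1');       -- the work we attempt
        -- reaching the end of the block successfully acts like RELEASE
    EXCEPTION WHEN OTHERS THEN
        -- an error acts like ROLLBACK TO the implicit savepoint,
        -- after which the outer transaction carries on
        INSERT INTO errors VALUES ('insert into foo failed');
    END;
END;
$$ LANGUAGE plpgsql;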
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Yep, agreed on the random I/O issue. The larger question is if you have a huge table, do you care to reclaim 3% of the table size, rather than just vacuum it when it gets to 10% dirty? I realize the vacuum is going to take a lot of time, but vacuuming to reclaim 3% three times seems like it is going to be more expensive than just vacuuming the 10% once. And vacuuming to reclaim 1% ten times seems even more expensive. The partial vacuum idea is starting to look like a loser to me again.

---

Gregory Stark wrote:
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
>
> > I agree it index cleanup isn't > 50% of vacuum. I was trying to figure
> > out how small, and it seems about 15% of the total table, which means if
> > we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps
> > 80%, assuming 5% of the table is scanned.
>
> Actually no. A while back I did experiments to see how fast reading a file
> sequentially was compared to reading the same file sequentially but skipping
> x% of the blocks randomly. The results were surprising (to me) and depressing.
> The breakeven point was about 7%.
>
> That is, if you assum that only 5% of the table will be scanned and you
> arrange to do it sequentially then you should expect the i/o to be marginally
> faster than just reading the entire table. Vacuum does do some cpu work and
> wouldn't have to consult the clog as often, so it would still be somewhat
> faster.
>
> The theory online was that as long as you're reading one page from each disk
> track you're going to pay the same seek overhead as reading the entire track.
> I also had some theories involving linux being confused by the seeks and
> turning off read-ahead but I could never prove them.
>
> In short, to see big benefits you would have to have a much smaller percentage
> of the table being read. That shouldn't be taken to mean that the DSM is a
> loser. There are plenty of use cases where tables can be extremely large and
> have only very small percentages that are busy. The big advantage of the DSM
> is that it takes the size of the table out of the equation and replaces it
> with the size of the busy portion of the table. So updating a single record in
> a terabyte table has the same costs as updating a single record in a kilobyte
> table.
>
> Sadly that's not quite true due to indexes, and due to the size of the bitmap
> itself. But going back to your numbers it does mean that if you update a
> single row out of a terabyte table then we'll be removing about 85% of the i/o
> (minus the i/o needed to read the DSM, about .025%). If you update about 1%
> then you would be removing substantially less, and once you get to about 10%
> then you're back where you started.
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
>
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
>
>    http://www.postgresql.org/docs/faq

-- 
Bruce Momjian   [EMAIL PROTECTED]
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
Re: [HACKERS] [GENERAL] Autovacuum Improvements
"Bruce Momjian" <[EMAIL PROTECTED]> writes: > I agree it index cleanup isn't > 50% of vacuum. I was trying to figure > out how small, and it seems about 15% of the total table, which means if > we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps > 80%, assuming 5% of the table is scanned. Actually no. A while back I did experiments to see how fast reading a file sequentially was compared to reading the same file sequentially but skipping x% of the blocks randomly. The results were surprising (to me) and depressing. The breakeven point was about 7%. That is, if you assum that only 5% of the table will be scanned and you arrange to do it sequentially then you should expect the i/o to be marginally faster than just reading the entire table. Vacuum does do some cpu work and wouldn't have to consult the clog as often, so it would still be somewhat faster. The theory online was that as long as you're reading one page from each disk track you're going to pay the same seek overhead as reading the entire track. I also had some theories involving linux being confused by the seeks and turning off read-ahead but I could never prove them. In short, to see big benefits you would have to have a much smaller percentage of the table being read. That shouldn't be taken to mean that the DSM is a loser. There are plenty of use cases where tables can be extremely large and have only very small percentages that are busy. The big advantage of the DSM is that it takes the size of the table out of the equation and replaces it with the size of the busy portion of the table. So updating a single record in a terabyte table has the same costs as updating a single record in a kilobyte table. Sadly that's not quite true due to indexes, and due to the size of the bitmap itself. But going back to your numbers it does mean that if you update a single row out of a terabyte table then we'll be removing about 85% of the i/o (minus the i/o needed to read the DSM, about .025%). If you update about 1% then you would be removing substantially less, and once you get to about 10% then you're back where you started. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [GENERAL] Autovacuum Improvements
On Mon, 2007-01-22 at 12:18 -0500, Bruce Momjian wrote: > Heikki Linnakangas wrote: > > > > In any case, for the statement "Index cleanup is the most expensive part > > of vacuum" to be true, you're indexes would have to take up 2x as much > > space as the heap, since the heap is scanned twice. I'm sure there's > > databases like that out there, but I don't think it's the common case. > > I agree it index cleanup isn't > 50% of vacuum. I was trying to figure > out how small, and it seems about 15% of the total table, which means if > we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps > 80%, assuming 5% of the table is scanned. Clearly keeping track of what needs vacuuming will lead to a more efficient VACUUM. Your math applies to *any* design that uses some form of book-keeping to focus in on the hot spots. On a separate thread, Heikki has raised a different idea for VACUUM. Heikki's idea asks an important question: where and how should DSM information be maintained? Up to now everybody has assumed that it would be maintained when DML took place and that the DSM would be a transactional data structure (i.e. on-disk). Heikki's idea requires similar bookkeeping requirements to the original DSM concept, but the interesting aspect is that the DSM information is collected off-line, rather than being an overhead on every statement's response time. That idea seems extremely valuable to me. One of the main challenges is how we cope with large tables that have a very fine spray of updates against them. A DSM bitmap won't help with that situation, regrettably. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Heikki Linnakangas wrote:
> Bruce Momjian wrote:
> > Heikki Linnakangas wrote:
> >> Russell Smith wrote:
> >>> 2. Index cleanup is the most expensive part of vacuum. So doing a
> >>> partial vacuum actually means more I/O as you have to do index cleanup
> >>> more often.
> >> I don't think that's usually the case. Index(es) are typically only a
> >> fraction of the size of the table, and since 8.2 we do index vacuums in
> >> a single scan in physical order. In fact, in many applications the index
> >> is be mostly cached and the index scan doesn't generate any I/O at all.
> >
> > Are _all_ the indexes cached? I would doubt that.
>
> Well, depends on your schema, of course. In many applications, yes.
>
> > Also, for typical table, what percentage is the size of all indexes combined?
>
> Well, there's no such thing as a typical table. As an anecdote here's
> the ratios (total size of all indexes of a table)/(size of corresponding
> heap) for the bigger tables for a DBT-2 run I have at hand:
>
> Stock: 1190470/68550 = 6%
> Order_line: 950103/274372 = 29%
> Customer: 629011 /(5711+20567) = 8%
>
> In any case, for the statement "Index cleanup is the most expensive part
> of vacuum" to be true, you're indexes would have to take up 2x as much
> space as the heap, since the heap is scanned twice. I'm sure there's
> databases like that out there, but I don't think it's the common case.

I agree that index cleanup isn't > 50% of vacuum. I was trying to figure out how small, and it seems to be about 15% of the total table size, which means if we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps 80%, assuming 5% of the table is scanned.

-- 
Bruce Momjian   [EMAIL PROTECTED]
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend
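One way to reproduce the "perhaps 80%" figure above, assuming (as the message seems to) that the index-cleanup cost of roughly 15% of the total is unchanged and only the heap portion shrinks to the 5% that actually gets scanned:

SELECT round(100 * (1 - (0.15 + 0.85 * 0.05)), 2) AS pct_vacuum_io_saved;
-- 0.15 (index cleanup) + 0.85 * 0.05 (heap portion) = 0.1925 of today's cost,
-- i.e. roughly an 80% reduction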
Re: [HACKERS] savepoint improvements
On 1/22/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> I'm getting tired of repeating this, but: neither of you have said
> anything that doesn't appear to me to be handled by ON_ERROR_ROLLBACK.
> What exactly is lacking in that feature?

* the ability to span the savepoint across multiple statements.
* the ability to get what you want without wastefully creating a savepoint before every statement.
* losing some behavior which (IMO) is general and beneficial.

how do psql tricks help proper stored procedures should we ever get them?

That being said, some simple extensions to the psql rollback feature would get the job done I guess. I'm still not happy with it but I knew it was a tough go from the beginning...I appreciate everyone's comments.

merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [HACKERS] savepoint improvements
On Mon, 2007-01-22 at 16:11 +, Gregory Stark wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > > BEGIN; > > stmt1; > > stmt2; <-- error > > stmt3; > > COMMIT; > > > > results in stmt3 completing successfully even though stmt1 and stmt2 do > > not == broken script. > > stmt1 would still be completed successfully. OK, understood. ON_ERROR_ROLLBACK is what we need, for psql only. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] savepoint improvements
On 1/22/07, Martijn van Oosterhout wrote: we're talking about psql, maybe all we need to do is create the commands in psql: \begin_nest \commit_nest \rollback_nest That would work if we could rollback conditionally on failure (like on_error_rollback but with definable beginning and ending points). I still think we are hacking around limitations of savepoints but it would solve the scripting problem at least. A general implementation on the server would benefit everybody. merlin ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] pg_dump ANALYZE statements
On Mon, 2007-01-22 at 10:49 -0500, Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > ISTM that pg_dump should default to have ANALYZE statements for a table,
> > when both table definition and data are dumped. If only data (-a) or
> > only table definition (-s) is dumped the default should be to *not* add
> > an ANALYZE statement.
>
> Having pg_dump emit ANALYZE was discussed and rejected years ago.
> Have you read that discussion? Do you have any new arguments to make?

Well, it's been suggested before, but I can't see any good arguments against. Specifically, there was one person who spoke in favour of it last time it was mentioned.
http://archives.postgresql.org/pgsql-hackers/2003-02/msg01270.php

If it's a performance tip, we should be doing it automatically. If we genuinely believe that autovacuum will handle it, then we can simply alter the docs to say: if you aren't running autovacuum, then don't forget to ANALYZE. IMHO it is not sufficient to rely upon autovacuum to do all of this work for us.

Rod mentions the discussion has been raised before but doesn't state what the arguments were:
http://archives.postgresql.org/pgsql-hackers/2003-02/msg01264.php

These other posts also seem to be in favour of the idea...
http://archives.postgresql.org/pgsql-performance/2006-10/msg00142.php
http://archives.postgresql.org/pgsql-hackers/2003-02/msg01273.php

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

       http://www.postgresql.org/about/donate
Re: [HACKERS] savepoint improvements
"Simon Riggs" <[EMAIL PROTECTED]> writes: > BEGIN; > stmt1; > stmt2; <-- error > stmt3; > COMMIT; > > results in stmt3 completing successfully even though stmt1 and stmt2 do > not == broken script. stmt1 would still be completed successfully. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Bruce Momjian wrote:
> Heikki Linnakangas wrote:
> > Russell Smith wrote:
> > > 2. Index cleanup is the most expensive part of vacuum. So doing a
> > > partial vacuum actually means more I/O as you have to do index cleanup
> > > more often.
> >
> > I don't think that's usually the case. Index(es) are typically only a
> > fraction of the size of the table, and since 8.2 we do index vacuums in
> > a single scan in physical order. In fact, in many applications the index
> > is be mostly cached and the index scan doesn't generate any I/O at all.
>
> Are _all_ the indexes cached? I would doubt that.

Well, depends on your schema, of course. In many applications, yes.

> Also, for typical table, what percentage is the size of all indexes combined?

Well, there's no such thing as a typical table. As an anecdote, here are the ratios (total size of all indexes of a table)/(size of corresponding heap) for the bigger tables for a DBT-2 run I have at hand:

Stock: 1190470/68550 = 6%
Order_line: 950103/274372 = 29%
Customer: 629011 /(5711+20567) = 8%

In any case, for the statement "Index cleanup is the most expensive part of vacuum" to be true, your indexes would have to take up 2x as much space as the heap, since the heap is scanned twice. I'm sure there's databases like that out there, but I don't think it's the common case.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

       http://www.postgresql.org/docs/faq
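Ratios like the anecdotal ones above can be computed on any database with the built-in size functions. A hedged sketch using pg_relation_size (available since 8.1); the table name 'order_line' is only an example and the query assumes the table has at least one index:

SELECT c.relname,
       pg_size_pretty(pg_relation_size(c.oid))                     AS heap_size,
       pg_size_pretty(sum(pg_relation_size(i.indexrelid))::bigint) AS total_index_size,
       round(100 * sum(pg_relation_size(i.indexrelid))
                 / pg_relation_size(c.oid)::numeric, 1)            AS index_pct_of_heap
FROM pg_class c
JOIN pg_index i ON i.indrelid = c.oid
WHERE c.relname = 'order_line'
GROUP BY c.relname, c.oid;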
Re: [HACKERS] savepoint improvements
On Mon, 2007-01-22 at 10:46 -0500, Tom Lane wrote:
> "Merlin Moncure" <[EMAIL PROTECTED]> writes:
> > On 1/22/07, Simon Riggs <[EMAIL PROTECTED]> wrote:
> >> Could you post an example, just so we're all clear what the problems
> >> are? I thought I understood what you are requesting; I may not.
> >
> > ok,
> > The short version is I would like the ability to run some sql commands
> > and recover the transaction if an error occurs.
>
> I'm getting tired of repeating this, but: neither of you have said
> anything that doesn't appear to me to be handled by ON_ERROR_ROLLBACK.
> What exactly is lacking in that feature?

Sorry for not replying to your other post.

ON_ERROR_ROLLBACK doesn't do the same thing, that's why. It shuts out the noise messages, true, but it doesn't re-execute all of the commands in the transaction that succeeded and so breaks the transaction, as originally coded.

BEGIN;
stmt1;
stmt2; <-- error
stmt3;
COMMIT;

results in stmt3 completing successfully even though stmt1 and stmt2 do not == broken script.

The behaviour we've been discussing is when stmt2 fails, to allow stmt3 to be submitted, so that at commit, stmt1 and stmt3 effects will be successful *if* the user wishes this.

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

       http://www.postgresql.org/docs/faq
Re: [HACKERS] pg_dump ANALYZE statements
"Simon Riggs" <[EMAIL PROTECTED]> writes: > There doesn't seem to be any reason to skip the ANALYZE, but I'll > implement it as an option. > -z on | off > --analyze=on | off Only an aesthetic comment: Short options don't usually take on/off arguments, I would suggest making the default be to analyze and make -z and --analyze=off disable the analyze. You might also consider having a --analyze=verbose and perhaps a --analyze=full though currently that would require doing vacuum analyze. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] pg_dump ANALYZE statements
Simon Riggs wrote: There is currently a performance tip to run ANALYZE after a pg_dump script has been restored. ISTM that pg_dump should default to have ANALYZE statements for a table, when both table definition and data are dumped. If only data (-a) or only table definition (-s) is dumped the default should be to *not* add an ANALYZE statement. There doesn't seem to be any reason to skip the ANALYZE, but I'll implement it as an option. -z on | off --analyze=on | off This would add a table-specific ANALYZE statement following each table's actions. I'm not aware of a strong argument against such an option. Performance surely can't be one because the time saved on the ANALYZE will quickly bite back on time lost on poorly planned queries. What does the panel think? how is this going to interact with the (now by default enabled) autovacuum daemon ? Stefan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] pg_dump ANALYZE statements
"Simon Riggs" <[EMAIL PROTECTED]> writes: > ISTM that pg_dump should default to have ANALYZE statements for a table, > when both table definition and data are dumped. If only data (-a) or > only table definition (-s) is dumped the default should be to *not* add > an ANALYZE statement. Having pg_dump emit ANALYZE was discussed and rejected years ago. Have you read that discussion? Do you have any new arguments to make? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] savepoint improvements
On Mon, Jan 22, 2007 at 10:40:37AM -0500, Merlin Moncure wrote: > The short version is I would like the ability to run some sql commands > any error updating foo or bar will blow up the whole thing. Maybe > this is desirable, but it is often nice to be able to do some error > handling here. In the pre-savepoint NT implementation I could: Nested transactions are trivially implemented on top of savepoints. If we're talking about psql, maybe all we need to do is create the commands in psql: \begin_nest \commit_nest \rollback_nest Would that suit your purpose? Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] savepoint improvements
"Merlin Moncure" <[EMAIL PROTECTED]> writes: > On 1/22/07, Simon Riggs <[EMAIL PROTECTED]> wrote: >> Could you post an example, just so we're all clear what the problems >> are? I thought I understood what you are requesting; I may not. > ok, > The short version is I would like the ability to run some sql commands > and recover the transaction if an error occurs. I'm getting tired of repeating this, but: neither of you have said anything that doesn't appear to me to be handled by ON_ERROR_ROLLBACK. What exactly is lacking in that feature? regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] pg_dump ANALYZE statements
There is currently a performance tip to run ANALYZE after a pg_dump script has been restored.

ISTM that pg_dump should default to have ANALYZE statements for a table, when both table definition and data are dumped. If only data (-a) or only table definition (-s) is dumped the default should be to *not* add an ANALYZE statement. There doesn't seem to be any reason to skip the ANALYZE, but I'll implement it as an option.

-z on | off
--analyze=on | off

This would add a table-specific ANALYZE statement following each table's actions.

I'm not aware of a strong argument against such an option. Performance surely can't be one, because the time saved by skipping the ANALYZE will quickly be lost to poorly planned queries.

What does the panel think?

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

       http://www.postgresql.org/docs/faq
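For illustration only, the end of each table's section in a dump produced with such an option might look roughly like the following; the table and column names are invented, and the exact placement of the ANALYZE relative to index creation is part of what is being debated in this thread:

COPY orders (order_id, customer_id, amount) FROM stdin;
...
\.

-- emitted because both the table definition and its data were dumped
ANALYZE orders;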
Re: [HACKERS] savepoint improvements
On 1/22/07, Simon Riggs <[EMAIL PROTECTED]> wrote:
> Could you post an example, just so we're all clear what the problems
> are? I thought I understood what you are requesting; I may not.

ok,

The short version is I would like the ability to run some sql commands and recover the transaction if an error occurs. We have the ability to do this with savepoint...rollback to savepoint...but these are not useful without introducing an external language (c, perl) that can catch the errors and do a rollback to a savepoint conditionally on the sql error state.

How would this be useful? Well when I update production systems I often do this from a master script that loads smaller scripts from another place:

-- update_production.sql
begin;
\i update_foo.sql
\i update_bar.sql
commit;

any error updating foo or bar will blow up the whole thing. Maybe this is desirable, but it is often nice to be able to do some error handling here. In the pre-savepoint NT implementation I could:

-- update_production.sql
begin;
  begin;
  insert into log values ('foo');
  \i update_foo.sql
  commit;
  begin;
  insert into log values ('bar');
  \i update_bar.sql
  commit;
commit;

In between the inner transactions I could check 'log' to see if everything went through and take appropriate action. Now client applications have the luxury of being able to check the return code of the query execution call, but SQL-only scripts can't. This would be perfectly acceptable:

-- update_production.sql
begin;
savepoint foo;
\i update_foo.sql
rollback to savepoint foo [if it failed only];
savepoint bar;
\i update_bar.sql
rollback to savepoint bar [if it failed only];
commit;

This would be just great for scripts but would also help client side programming a bit by introducing more flexible error handling behaviors without having to handle things via the returned sql error code. The on errors bit I was talking about earlier is just syntax sugar but the critical part is being able to recover transactions partially without an external handler...

merlin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Heikki Linnakangas wrote:
> Russell Smith wrote:
> > 2. Index cleanup is the most expensive part of vacuum. So doing a
> > partial vacuum actually means more I/O as you have to do index cleanup
> > more often.
>
> I don't think that's usually the case. Index(es) are typically only a
> fraction of the size of the table, and since 8.2 we do index vacuums in
> a single scan in physical order. In fact, in many applications the index
> is be mostly cached and the index scan doesn't generate any I/O at all.

Are _all_ the indexes cached? I would doubt that. Also, for a typical table, what percentage is the size of all indexes combined?

-- 
Bruce Momjian   [EMAIL PROTECTED]
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
Re: [HACKERS] savepoint improvements
On Mon, 2007-01-22 at 09:25 -0500, Merlin Moncure wrote: > On 1/21/07, Tom Lane <[EMAIL PROTECTED]> wrote: > > "Jaime Casanova" <[EMAIL PROTECTED]> writes: > > > On 1/21/07, Simon Riggs <[EMAIL PROTECTED]> wrote: > > >> - continue on error i.e. COMMIT can/might succeed - though there are > > >> still cases where it cannot, such as a serializable exception. > > > > > and what should be the behaviour of that? the same as rollback? > > > > The only conceivable implementation is an implicit savepoint issued > > before each statement. > > I'm not sure I agree here...before the NT implementation was changed > over to savepoint syntax it was perfectly possible to recover from > errors inside a transaction...and is still possible in plpgsql > functions only. What I'm asking for is to reopen this behavior > somehow...in the production environments I've worked in application > update and maintenance relied heavily on scripting, and lack of this > functionality forces me to wrap the script launch with C code to work > around limitations of the savepoint system. Could you post an example, just so we're all clear what the problems are? I thought I understood what you are requesting; I may not. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Piggybacking vacuum I/O
I've been looking at the way we do vacuums.

The fundamental performance issue is that a vacuum generates nheapblocks+nindexblocks+ndirtyblocks I/Os. Vacuum cost delay helps to spread the cost like part payment, but the total is the same. In an I/O bound system, the extra I/O directly leads to less throughput. Therefore, we need to do less I/O.

Dead space map helps by allowing us to skip blocks that don't need vacuuming, reducing the # of I/Os to 2*ndirtyblocks+nindexblocks. That's great, but it doesn't help us if the dead tuples are spread uniformly.

If we could piggyback the vacuum I/Os onto the I/Os that we're doing anyway, vacuum wouldn't ideally have to issue any I/O of its own. I've tried to figure out a way to do that.

Vacuum is done in 3 phases:

1. Scan heap
2. Vacuum index
3. Vacuum heap

Instead of doing a sequential scan, we could perform the 1st phase by watching the buffer pool, scanning blocks for dead tuples when they're in memory and keeping track of which pages we've seen. When all pages have been seen, the tid list is sorted and the 1st phase is done.

In theory, the index vacuum could also be done that way, but let's assume for now that indexes would be scanned like they are currently.

The 3rd phase can be performed similarly to the 1st phase. Whenever a page enters the buffer pool, we check the tid list and remove any matching tuples from the page. When the list is empty, vacuum is complete.

Of course, there are some issues in the design as described above. For example, the vacuum might take a long time if there's cold spots in the table. In fact, a block full of dead tuples might never be visited again.

A variation of the scheme would be to keep scanning pages that are in cache, until the tid list reaches a predefined size, instead of keeping track of which pages have already been seen. That would deal better with tables with hot and cold spots, but it couldn't advance the relfrozenxid because there would be no guarantee that all pages are visited. Also, we could start the 1st phase of the next vacuum while we're still in the 3rd phase of the previous one.

Also, after we've seen 95% of the pages or a timeout expires, we could fetch the rest of them with random I/O to let the vacuum finish.

I'm not sure how exactly this would be implemented. Perhaps bgwriter or autovacuum would do it, or a new background process. Presumably the process would need access to relcache.

One issue is that if we're trying to vacuum every table simultaneously this way, we'll need more overall memory for the tid lists. I'm hoping there's a way to implement this without requiring shared memory for the tid lists; that would make the memory management a nightmare. Also, we'd need changes to the bufmgr API to support this.

This would work nicely with the DSM. The list of pages that need to be visited in phase 1 could be initialized from the DSM, largely avoiding the problem with cold spots.

Any thoughts before I start experimenting?

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org
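A rough way to see how much of phase 1 such a scheme could do "for free" today is to look at which blocks of a table are already resident in shared buffers. The sketch below assumes the contrib/pg_buffercache module is installed and uses 'accounts' purely as an example table name; it says nothing about which of those resident pages actually contain dead tuples:

-- blocks of one table currently resident (and dirty) in shared buffers
SELECT count(*) AS cached_blocks,
       sum(CASE WHEN b.isdirty THEN 1 ELSE 0 END) AS dirty_blocks
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = c.relfilenode
WHERE c.relname = 'accounts'
  AND b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database());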
Re: [HACKERS] savepoint improvements
On 1/22/07, Csaba Nagy <[EMAIL PROTECTED]> wrote:
> On Sat, 2007-01-20 at 18:08, Merlin Moncure wrote:
> [snip]
> > To be honest, I'm not a huge fan of psql tricks (error recovery being
> > another example) but this could provide a solution. in your opnion,
> > how would you use \if to query the transaction state?
>
> Wouldn't it make sense to introduce instead something like:
>
> \set language plpgsql
>
> ... and then redirect to plpgsql all you type ? That would give you the
> possibility to execute things in your favorite language directly from
> psql without creating a function.

The nature of pl/pgsql would make this impossible, or at least highly complex and difficult...one reason is that the language has a much more complex internal state than sql. Most other languages for which I think this would be worthwhile already have their own immediate-execution interpreters.

merlin

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

       http://www.postgresql.org/about/donate
Re: [HACKERS] savepoint improvements
On 1/21/07, Tom Lane <[EMAIL PROTECTED]> wrote: "Jaime Casanova" <[EMAIL PROTECTED]> writes: > On 1/21/07, Simon Riggs <[EMAIL PROTECTED]> wrote: >> - continue on error i.e. COMMIT can/might succeed - though there are >> still cases where it cannot, such as a serializable exception. > and what should be the behaviour of that? the same as rollback? The only conceivable implementation is an implicit savepoint issued before each statement. I'm not sure I agree here...before the NT implementation was changed over to savepoint syntax it was perfectly possible to recover from errors inside a transaction...and is still possible in plpgsql functions only. What I'm asking for is to reopen this behavior somehow...in the production environments I've worked in application update and maintenance relied heavily on scripting, and lack of this functionality forces me to wrap the script launch with C code to work around limitations of the savepoint system. In pure SQL, we have a 'begin' statement equivalent but no 'end' statement. Why not? merlin ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Strange file in snapshot tarball
On Mon, Jan 22, 2007 at 08:00:26PM +0900, ITAGAKI Takahiro wrote: > There is a file 'compat_informix-dec_test-OpenBSD3.8.broken.stdout' > under the path 'src/interfaces/ecpg/test/extedted/' in the recent > ftp snapshot (postgresql-snapshot.tar.gz). > > All of the other files are placed under 'postgresql-snapshot/...'. > Is this intentional or a mistake? Would you please care to explain? I do not understand what you are saying. [EMAIL PROTECTED]:~$ tar -ztvf postgresql-snapshot.tar.gz |grep OpenBSD -rw-r--r-- pgsql/pgsql 32004 2007-01-12 12:31 postgresql-snapshot/src/interfaces/ecpg/test/expected/compat_informix-dec_test-OpenBSD3.8.broken.stdout -rw-r--r-- pgsql/pgsql 81524 2007-01-12 12:31 postgresql-snapshot/src/interfaces/ecpg/test/expected/pgtypeslib-num_test2-OpenBSD3.8.broken.stdout I don't see the difference here. Michael -- Michael Meskes Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED] Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL! ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Russell Smith wrote:
> 2. Index cleanup is the most expensive part of vacuum. So doing a
> partial vacuum actually means more I/O as you have to do index cleanup
> more often.

I don't think that's usually the case. Index(es) are typically only a fraction of the size of the table, and since 8.2 we do index vacuums in a single scan in physical order. In fact, in many applications the index is mostly cached and the index scan doesn't generate any I/O at all.

I believe the heap scans are the biggest issue at the moment.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Strange file in snapshot tarball
Hello, There is a file 'compat_informix-dec_test-OpenBSD3.8.broken.stdout' under the path 'src/interfaces/ecpg/test/extedted/' in the recent ftp snapshot (postgresql-snapshot.tar.gz). All of the other files are placed under 'postgresql-snapshot/...'. Is this intentional or a mistake? Regards, --- ITAGAKI Takahiro NTT Open Source Software Center ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] STOP all user access except for admin for a few minutes?
On Monday, 22 January 2007 at 10:32, [EMAIL PROTECTED] wrote:
> Is it possible to stop all user access to postgres, but still give access to admin?

Make the appropriate adjustments to pg_hba.conf. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
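As a concrete illustration of that suggestion, here is a minimal pg_hba.conf sketch for a maintenance window; "admin" is a placeholder role name and the addresses are assumptions, so adapt both to the actual setup. The first matching line wins, so the admin entries must come before the catch-all reject:

    # TYPE  DATABASE  USER    CIDR-ADDRESS    METHOD
    local   all       admin                   md5
    host    all       admin   127.0.0.1/32    md5
    host    all       all     0.0.0.0/0       reject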
Re: [HACKERS] STOP all user access except for admin for a few minutes?
[EMAIL PROTECTED] wrote:
> Hi there, Is it possible to stop all user access to postgres, but still give access to admin? Just temporarily, not a security setup. Something like, stop all users but allow user x and y.
> thx

You could restart in single-user mode, or alter pg_hba.conf to allow the users you want and disallow all other users. Single-user mode will require that you have direct access to the machine to do the alterations. Using pg_hba.conf will not disconnect existing users as far as I'm aware. That's the best advice I can offer; maybe somebody else will be able to give you more. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
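A hedged sketch of the mechanics behind both suggestions; the data directory path and database name are placeholders, and the --single spelling of the standalone backend is the 8.2-era form:

    # re-read pg_hba.conf without restarting, after editing it as described
    pg_ctl reload -D /usr/local/pgsql/data

    # or stop the server entirely and work through the standalone backend
    pg_ctl stop -D /usr/local/pgsql/data
    postgres --single -D /usr/local/pgsql/data mydb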
[HACKERS] STOP all user access except for admin for a few minutes?
Hi there, Is it possible to stop all user access to postgres, but still give access to admin? Just temporarily, not a security setup. Something like, stop all users but allow user x and y. thx ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] savepoint improvements
On Sat, 2007-01-20 at 18:08, Merlin Moncure wrote:
[snip]
> To be honest, I'm not a huge fan of psql tricks (error recovery being another example) but this could provide a solution. In your opinion, how would you use \if to query the transaction state?

Wouldn't it make sense to introduce instead something like:

    \set language plpgsql

... and then redirect everything you type to plpgsql? That would give you the possibility to execute things in your favorite language directly from psql without creating a function. Cheers, Csaba. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
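For illustration only: the \set language line above is the hypothetical syntax being proposed, not an existing psql feature. The closest thing PostgreSQL later grew is the DO statement (9.0 onward), which runs an anonymous plpgsql block straight from psql without creating a function:

    -- a sketch of the later DO statement; no function is created
    DO $$
    BEGIN
        RAISE NOTICE 'plpgsql executed directly from psql';
    END
    $$;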
Re: [HACKERS] [GENERAL] Autovacuum Improvements
On Sun, 2007-01-21 at 14:26 -0600, Jim C. Nasby wrote:
> On Sun, Jan 21, 2007 at 11:39:45AM +, Heikki Linnakangas wrote:
> > Russell Smith wrote:
> > > Strange idea that I haven't researched: given that vacuum can't be run in a transaction, it is possible at a certain point to quit the current transaction and start another one. There has been much chat and now a TODO item about allowing multiple vacuums to not starve small tables. But if a big table has a long-running vacuum, the vacuum of the small table won't be effective anyway, will it? If vacuum of a big table was done in multiple transactions you could reduce the effect of a long-running vacuum. I'm not sure how this affects the rest of the system though.
> >
> > That was fixed by Hannu Krosing's patch in 8.2 that made vacuum ignore other vacuums in the oldest xmin calculation.
>
> And IIRC in 8.1 every time vacuum finishes a pass over the indexes it will commit and start a new transaction.

Err... it doesn't do this now, and IIRC didn't do that in 8.1 either.

> That's still useful even with Hannu's patch in case you start a vacuum with maintenance_work_mem too small; you can abort the vacuum some time later and at least some of the work it's done will get committed.

True, but not recommended, for a variety of reasons. The reason is not intermediate commits, but just that the work of VACUUM is mostly non-transactional in nature, apart from the various catalog entries when it completes. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
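To illustrate the maintenance_work_mem point with a hedged sketch (the table name and the 512MB figure are placeholders, not recommendations): sizing the setting so the dead-tuple list fits in memory lets VACUUM clean the indexes in a single pass instead of several.

    -- a sketch for a manual vacuum of one large table
    SET maintenance_work_mem = '512MB';   -- 8.2-style unit syntax
    VACUUM VERBOSE bigtable;              -- VERBOSE reports each index-cleanup pass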
Re: [HACKERS] savepoint improvements
On Sun, 2007-01-21 at 13:28 -0500, Tom Lane wrote:
> "Jaime Casanova" <[EMAIL PROTECTED]> writes:
> > On 1/21/07, Simon Riggs <[EMAIL PROTECTED]> wrote:
> >> - continue on error i.e. COMMIT can/might succeed - though there are
> >> still cases where it cannot, such as a serializable exception.
> > and what should be the behaviour of that? the same as rollback?

No. The behaviour is to continue the transaction even though an error has occurred, i.e.

BEGIN;
1. INSERT ...                success
2. INSERT VALUES () () ()    -- fails with an error on the 3rd VALUES list
   dynamically re-construct the INSERT statement with the remaining 2 VALUES lists
3. INSERT VALUES () ();      success
COMMIT;
work done by 1 and 3 is committed

Behaviour needs to support any error at (2) except serializable exceptions.

> The only conceivable implementation is an implicit savepoint issued before each statement.

Perhaps the only acceptable one.

> By and large that seems to me to be most easily handled on the client side, and many of our client libraries already have the ability to do it.

PL/pgSQL supports EXCEPTION blocks, but no other clients support it, AFAICS.

> (For instance, psql has ON_ERROR_ROLLBACK.)

That's not the same thing, regrettably.

> If we tried to do it on the server side, we would break any client software that wasn't prepared for the change of behavior --- see the 7.3 autocommit fiasco for an example.

Only if we changed the default behaviour, which I am not suggesting.

> So as far as the server is concerned, I see no TODO here.

If the server team won't allow it, we must document that this behaviour must be a client-side function in the *server* TODO, so that all the various client projects can read the same TODO item and implement it: "Implement continue-on-error transactional behaviour for each client library." -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
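For comparison, a script can already get the same effect today with explicit savepoints; a hedged sketch, where the table "t" and the offending value are placeholders:

    BEGIN;
    INSERT INTO t VALUES (1);             -- statement 1: succeeds
    SAVEPOINT before_bulk;
    INSERT INTO t VALUES (2), (3), (1);   -- statement 2: fails (say, duplicate key on the 3rd list)
    ROLLBACK TO SAVEPOINT before_bulk;    -- the transaction itself survives the error
    INSERT INTO t VALUES (2), (3);        -- statement 3: re-issued without the bad row
    COMMIT;                               -- work from statements 1 and 3 is committed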