Re: [HACKERS] 8.5 release timetable, again
Alvaro Herrera wrote: This seems a good idea. Possibly pushing the betas more aggressively to current users would make them tested not only by PG hackers ... Isn't this the purpose of the new alpha releases, at least to some extent? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Visibility map, partial vacuums
Tom Lane wrote: However, my comment above was too optimistic, because in an insert-only scenario autovac would in fact not trigger VACUUM at all, only ANALYZE. So it seems like we do indeed want to rejigger autovac's rules a bit to account for the possibility of wanting to apply vacuum to get visibility bits set. I'm sure I'm missing something, but I thought the point of this was to lessen the impact of VACUUM, and now you are suggesting that we add vacuums to tables that have never needed one before.
Re: [HACKERS] Block-level CRC checks
Aidan Van Dyk wrote: * Greg Stark [EMAIL PROTECTED] [081117 03:54]: I thought of saying that too but it doesn't really solve the problem. Think of what happens if someone sets a hint bit on a dirty page. If the page is dirty from a real change, then it has a WAL backup block record already, so the torn page on disk is going to be fixed with the WAL replay ... *because* of the torn-page problem already being solved in PG. You don't get the hint bits back, but that's no different from the current state. But nobody's previously cared if hint bits weren't set after WAL replay. What if all changes to a page (even hint bits) were WAL-logged when running with block-level CRC checks enabled? Would that make things easier? I'm sure it would result in some performance loss, but anyone enabling block-level CRCs is already trading some performance for safety. Thoughts?
Re: [HACKERS] [WIP] In-place upgrade
Tom Lane wrote: Decibel! [EMAIL PROTECTED] writes: I think that's pretty seriously un-desirable. It's not at all uncommon for databases to stick around for a very long time and then jump ahead many versions. I don't think we want to tell people they can't do that. Of course they can do that --- they just have to do it one version at a time. Also, people may be less likely to stick with an old outdated version for years and years if the upgrade process is easier.
Re: [HACKERS] RAM-only temporary tables
Kevin Grittner wrote: An idea for a possible enhancement to PostgreSQL: allow creation of a temporary table without generating any disk I/O. (Creating and dropping a three-column temporary table within a database transaction currently generates about 150 disk writes). If some circumstances don't make it feasible to always do this as a RAM-only operation, perhaps a clause could be added to the CREATE TEMPORARY TABLE syntax to specify this behavior along with whatever limitations on the temporary table are required for this to work. (For example, maybe this is only feasible if the table will be dropped by the end of the transaction?) As someone else already pointed out, you can put temp tables on a RAM disk, but the larger issue is that temp tables still cause system-table churn, which will always need to be on stable media.
Re: [HACKERS] 8.3 .4 + Vista + MingW + initdb = ACCESS_DENIED
Tom Lane wrote: Andrew Chernow [EMAIL PROTECTED] writes: Be careful. From LockFileEx docs: However, the time it takes for the operating system to unlock these locks depends upon available system resources. Therefore, it is recommended that your process explicitly unlock all files it has locked when it terminates. If this is not done, access to these files may be denied if the operating system has not yet unlocked them. ROTFL ... so to translate: If your program crashes, please release locks before crashing. Obviously that wasn't the intent of the above, but I guess it is the net effect. Either way, I don't think it's a huge problem; it just means that PG may not be able to restart for a few seconds until the OS has had time to clean up the locks.
Re: [HACKERS] September CommitFest Closed
Josh Berkus wrote: For the September commitfest, 29 patches were applied (one to pgFoundry) and 18 patches were sent back for more work. More importantly, six *new* reviewers completed reviews of various patches: Abbas Butt, Alex Hunsaker, Markus Wanner, Ibrar Ahmed, Ryan Bradetich and Gianni Colli. Several other new reviewers volunteered, but we ran out of patches to check, and Kenneth Marshall had to quit reviewing because of Hurricane Ike. Yay, reviewers. Even if nothing else ever came of the CommitFest approach, creating more reviewers alone would make it a success. I think it does this by lowering the bar to entry for becoming a reviewer: it brings regular focus to patch review, and it changes the review process from a last-minute dash that only experienced hackers could get involved in to one with more time before the final deadline, and hence one friendlier to new reviewers. Good news all around!
Re: [HACKERS] [PATCHES] VACUUM Improvements - WIP Patch
Joshua D. Drake wrote: Merlin Moncure wrote: Well, there doesn't seem to be a TODO for partial/restartable vacuums, which were mentioned upthread. This is a really desirable feature for big databases and removes one of the reasons to partition large tables. I would agree that partial vacuums would be very useful. I think everyone agrees that partial vacuums would be useful / *A Good Thing* but it's the implementation that is the issue. I was thinking about Alvaro's recent work to make vacuum deal with TOAST tables separately, which is almost like a partial vacuum since it effectively splits the vacuum work up into multiple independent blocks of work, the limitation obviously being that it can only split the work around TOAST. Is there any way that vacuum could work per relfile, since we already split tables into files that are never greater than 1GB? I would think that if vacuum never had more than 1GB of work to do at any given moment, it would be much more manageable.
Re: [HACKERS] [PATCHES] VACUUM Improvements - WIP Patch
Tom Lane wrote: Matthew T. O'Connor [EMAIL PROTECTED] writes: I think everyone agrees that partial vacuums would be useful / *A Good Thing* but it's the implementation that is the issue. I'm not sure how important it will really be once we have support for dead-space-map-driven vacuum. Is that something we can expect any time soon? I haven't heard much about it really happening for 8.4.
Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?
Tom Lane wrote: Greg Sabino Mullane [EMAIL PROTECTED] writes: Code outside of core, is, in reality, less reviewed, less likely to work well with recent PG versions, and more likely to cause problems. It's also less likely to be found by people, less likely to be used by people, and less likely to be included by distros. Not to say that everything should get shoved into core, of course, but there are strong arguments for both sides. These are all true statements, of course, but ISTM they should be looked on as problems to be solved. Pushing stuff into core instead of solving these problems is not a scalable long-term answer. A few random thoughts... The application that comes to mind first for me when you talk plugins is Firefox. They make it very easy to browse for plugins and to install, update, and remove them. Their plug-in system also tries to account for Firefox version and OS platform, which we would need to do as well. Perhaps one thing that would help PostgreSQL plug-ins is a nice GUI plug-in browser and management application. The logical place to add this IMHO is pgAdmin, since it is GUI, already talks to the DB, and is cross-platform. I'm not saying a GUI should be required to manage plug-ins; a full CLI option should be made available too.
Re: [HACKERS] Concurrent VACUUM and ANALYZE
Jonah H. Harris wrote: On Mon, Jul 21, 2008 at 10:19 PM, Tom Lane [EMAIL PROTECTED] wrote: I don't find this a compelling argument, at least not without proof that the various vacuum-improvement projects already on the radar screen (DSM-driven vacuum, etc) aren't going to fix your problem. Is DSM going to be in 8.4? The last I had heard, DSM+related improvements weren't close to being guaranteed for this release. If it doesn't make it, waiting another year and a half for something easily fixed would be fairly unacceptable. Should I provide a patch in the event that DSM doesn't make it? Can't hurt to submit a patch. Also, could you do something to help mitigate the worst case, such as not updating the stats in pg_class if a vacuum has finished since the current analyze started? Matt
Re: [HACKERS] Vacuuming leaked temp tables (once again)
Tom Lane wrote: We might have to rearrange the logic a bit to make that happen (I'm not sure what order things get tested in), but a log message does seem like a good idea. I'd go for logging anytime an orphaned table is seen, and dropping once it's past the anti-wraparound horizon. Is there an easy way for an admin to clean up the lost temp tables that autovacuum is complaining about? It seems like it could be a long time, and a lot of log messages, between when they are first orphaned and when they are finally dropped due to anti-wraparound protection.
Re: [HACKERS] XIDs and big boxes again ...
Hans-Juergen Schoenig wrote: i suggest to introduce a --with-long-xids flag which would give me 62 / 64 bit XIDs per vacuum on the entire database. this should be fairly easy to implement. i am not too concerned about the size of the tuple header here - if we waste 500 gb of storage here i am totally fine. As you say later in the thread, you are on 8.1. A lot of work has gone into reducing the effect, impact, and frequency of XID wraparound and vacuuming since then. In 8.3, transactions that don't actually update a table no longer use a real XID, and with autovacuum you no longer need a database-wide vacuum to solve the XID wraparound problem, so I think the answer is to upgrade to 8.3 and see if you still have this problem. Matthew O'Connor
Re: [HACKERS] Posting to hackers and patches lists
Alex Hunsaker wrote: In fact I would argue -patches should go away so we don't have that split. +1. I think the main argument for the split is to keep the large patch emails off the hackers list, but I don't think that limit is so high that it's a problem. People have to gzip their patches for the patches list fairly often anyway.
Re: [HACKERS] Posting to hackers and patches lists
Alex Hunsaker wrote: A big part of my problem with the split is if there is a discussion taking place on -hackers I want to be able to reply to the discussion and say well, here is what I was thinking. Sending it to -patches first, waiting for it to hit the archive so I can link to it in my reply on -hackers, seems pointless and convoluted. But if that's what you want, that's what I'll try to do from now on :) For instance the patch Tom reviewed of mine yesterday only -hackers was copied, so I maintained that but also added -patches because I was sending in a patch... I think it will be an ongoing problem though, especially for new people, as they probably won't understand the logical split... Patches are an integral part of the conversation about development; trying to split them off is awkward at best. Do people really still think that the potential for larger messages is a problem? By the way, what is the actual size limit on hackers vs. patches? I would imagine that most patches would already fit within the current hackers limit, especially since you can gzip them.
Re: [HACKERS] First steps with 8.3 and autovacuum launcher
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: This is an interesting idea, but I think it's attacking the wrong problem. To me, the problem here is that an ANALYZE should not block CREATE INDEX or certain forms of ALTER TABLE. I doubt that that will work; in particular I'm pretty dubious that you can safely make CREATE INDEX and VACUUM run together. Since they'd be unlikely to be using the identical OldestXmin horizon, you'd likely end up with dangling index entries (ie, CREATE INDEX indexes a tuple that the VACUUM removes shortly afterward). I think the main issue is ANALYZE not VACUUM (at least in this thread) since it's DB load times that are in question. Can CREATE INDEX and ANALYZE be made to run concurrently? ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] First steps with 8.3 and autovacuum launcher
Tom Lane wrote: If you insist on crafting a solution that only fixes this problem for pg_restore's narrow usage, you'll be back revisiting it before beta1 has been out a month. I don't know much about what is involved in crafting these solutions, but it seems we're close to beta and probably don't want to make drastic changes to anything. As such, it seems to me that solving the problem for ANALYZE is a nice piece of low-hanging fruit that addresses an immediate problem that has been reported. I would think that reducing the locking implications of VACUUM is much more involved, no? Also, I would think that the fix for ANALYZE will be totally different from the fix for VACUUM, no? Are you proposing that we solve the VACUUM locking problem before we release 8.3?
Re: [HACKERS] More logging for autovacuum
Gregory Stark wrote: I'm having trouble following what's going on with autovacuum and I'm finding the existing logging insufficient. In particular that it's only logging vacuum runs *after* the vacuum finishes makes it hard to see what vacuums are running at any given time. Also, I want to see what is making autovacuum decide to forgo vacuuming a table and the log with that information is at DEBUG2. So I would like to suggest adding two options: log_autovacuum_jobs - output every time a vacuum or analyze *starts* log_autovacuum_level - set the log level for the autovacuum process I would also suggest raising the level of the DEBUG2 message indicating why tables were chosen or not. At least to DEBUG1 if not to INFO. Am I missing anything? Are there ways to get this info already that I'm missing? I imagine it would be pretty simple to add these and I'll be happy to do it and send the patch to -patches assuming others (Alvaro? :) agree. I think this sounds good. There was talk a while ago about needing a special log-level setting just for autovacuum, but nobody did the legwork.
Re: [HACKERS] Still recommending daily vacuum...
Alvaro Herrera wrote: Matthew T. O'Connor wrote: Well, if a table has 10 rows, and we keep the current threshold of 1000 rows, then this table must have 1002 dead tuples (99% dead tuples, 1002 dead + 10 live) before being vacuumed. This seems wasteful because there are 500 dead tuples on it and only 10 live tuples. So each scan must wade through all the dead tuples. Another small table with 100 tuples will be vacuumed on every iteration as well, even if there are just two dead tuples. So you are right -- maybe dropping it all the way to 0 is too much. But a small value of 10 is reasonable? That will make the 10 tuple table be vacuumed when there are 10 dead tuples (50% of dead tuples), and the 100 tuple table when there are 11 (11% of dead tuples). It decreases quickly to the scale factor (2%, or do we want to decrease it to 1%?) I think it's probably fine. I think the optimal number for the base_threshold is probably dependent on the width of the row: for a very narrow row, where you might have many on the same page, 20 or 50 might be right, but for a very wide table a smaller number might be optimal; however, I think it probably doesn't matter much anyway. Reducing the default to 10 seems fine, and perhaps even removing it as a tuning knob. I think there are too many autovacuum knobs and it confuses people. Is it too late to possibly remove this GUC altogether?
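To make the arithmetic being discussed concrete, here is a rough sketch of the vacuum-threshold test (the names and the exact comparison are illustrative, not the actual backend code; the defaults shown assume the base threshold of 10 and 2% scale factor proposed above):

```python
# Hypothetical sketch of the autovacuum trigger equation:
#   threshold = base_threshold + scale_factor * live_tuples
# and the table is vacuumed once dead_tuples exceeds the threshold.
def needs_vacuum(live_tuples, dead_tuples, base_threshold=10, scale_factor=0.02):
    threshold = base_threshold + scale_factor * live_tuples
    return dead_tuples > threshold

# A 10-row table trips the threshold at roughly 50% dead tuples,
# while a 100-row table needs proportionally fewer dead tuples.
print(needs_vacuum(10, 11))    # small table, 11 dead tuples
print(needs_vacuum(100, 11))   # larger table, same 11 dead tuples
```

With the old default base threshold of 1000, the same 10-row table would need over a thousand dead tuples before qualifying, which is the waste Alvaro describes.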
Re: [HACKERS] Still recommending daily vacuum...
Alvaro Herrera wrote: Jim C. Nasby wrote: FWIW, I normally go with the 8.2 defaults, though I could see dropping vacuum_scale_factor down to 0.1 or 0.15. I also think the thresholds could be decreased further, maybe divide by 10. How about pushing thresholds all the way down to 0? As long as it handles small (or zero row) tables ok then yes. The base_threshold in the original contrib autovacuum was just an easy way to not vacuum really small tables too often. If a table has only 10 rows, it's going to get vacuumed every time one row is updated. I guess that's not a big problem with a table that small, but it still seems excessive. If you think this isn't a problem with the current autovacuum, then sure, turn it down to zero, and perhaps we can even get rid of it altogether in another release or two.
Re: [HACKERS] COPYable logs status
Andrew Dunstan wrote: The situation with this patch is that I now have it in a state where I think it could be applied, but there is one blocker, namely that we do not have a way of preventing the interleaving of log messages from different backends, which leads to garbled logs. This is an existing issue about which we have had complaints, but it becomes critical for a facility the whole purpose of which is to provide logs in a format guaranteed to work with our COPY command. Unfortunately, there is no solution in sight for this problem, certainly not one which I think can be devised and implemented simply at this stage of the cycle. The solution we'd like to use, LWLocks, is not workable in this context. In consequence, I don't think we have any option but to shelve this item for the time being. I think this will get shot down, but here goes anyway... How about creating a log-writing process? The postmaster could write to the log files directly until the log writer is up and running; after that, all processes could send their log output through it.
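The idea above can be illustrated with a toy sketch (Python threads standing in for backend processes, purely for self-containment; none of this is PostgreSQL code): if every backend hands *complete* messages to a single writer, output lines can never interleave mid-message.

```python
# Toy illustration of the single-log-writer idea: backends enqueue whole
# messages; only one consumer ever touches the log, so no interleaving.
import queue
import threading

log_queue = queue.Queue()
log_lines = []  # stands in for the log file

def log_writer():
    # The sole writer: drains the queue one complete message at a time.
    while True:
        msg = log_queue.get()
        if msg is None:            # shutdown sentinel
            break
        log_lines.append(msg)      # stands in for one atomic file write

writer = threading.Thread(target=log_writer)
writer.start()
for pid in (101, 102):             # two "backends" logging concurrently
    log_queue.put(f"backend {pid}: statement logged")
log_queue.put(None)
writer.join()
print(log_lines)
```

The real design question the thread raises is how to get the data to such a process cheaply and reliably, which a sketch like this glosses over entirely.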
Re: [HACKERS] COPYable logs status
Tom Lane wrote: Matthew T. O'Connor [EMAIL PROTECTED] writes: How about creating a log-writing-process? Postmaster could write to the log files directly until the log-writer is up and running, then all processes can send their log output through the log-writer. We *have* a log-writing process. The problem is in getting the data to it. By that I assume you mean the bgwriter; I thought that was for WAL data, and I didn't think it could, or perhaps should, be used for normal log file writing. But I also know I'm way outside my comfort area in talking about this, so excuse the noise if this is way off base.
Re: [HACKERS] Autovacuum launcher doesn't notice death of postmaster immediately
Alvaro Herrera wrote: Jim C. Nasby escribió: There *is* reason to allow setting the naptime smaller, though (or at least there was; perhaps Alvaro's recent changes negate this need): clusters that have a large number of databases. I've worked with folks who are in a hosted environment and give each customer their own database; it's not hard to get a couple hundred databases that way. Setting the naptime higher than a second in such an environment would mean it could be hours before a database is checked for vacuuming. Yes, the code in HEAD is different -- each database will be considered separately. So the huge database taking all day to vacuum will not stop the tiny databases from being vacuumed in a timely manner. And the very huge table in that database will not stop the other tables in the database from being vacuumed either. There can be more than one worker in a single database. Ok, but I think the question posed is that in, say, a virtual hosting environment there might be 1,000 databases in the cluster. Am I still going to have to wait a long time for my database to get vacuumed? I don't think this has changed much, no? (If the default naptime is 1 minute, then autovacuum won't even look at a given database but once every 1,000 minutes (16.67 hours), assuming that there isn't enough work to keep all the workers busy.)
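The back-of-the-envelope arithmetic in the parenthetical can be written out explicitly (illustrative numbers only; this assumes the simple model of one database visited per naptime interval, round-robin):

```python
# Worst-case interval between visits to any one database, assuming the
# launcher visits one database per naptime tick in round-robin order.
def worst_case_revisit_minutes(n_databases, naptime_minutes):
    return n_databases * naptime_minutes

minutes = worst_case_revisit_minutes(1000, 1.0)
print(minutes, round(minutes / 60, 2))  # 1000 minutes is roughly 16.67 hours
```

This is why a sub-minute naptime was attractive for hosted clusters with hundreds of databases, independent of the multi-worker changes in HEAD.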
Re: [HACKERS] [PATCHES] [BUGS] BUG #3326: Invalid lower bound of autovacuum_cost_limit
Alvaro Herrera wrote: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: But this is misleading (started postmaster with good value, then edited postgresql.conf and entered -2): 17903 LOG: received SIGHUP, reloading configuration files 17903 LOG: -2 is outside the valid range for parameter autovacuum_vacuum_cost_limit (-1 .. 1000) Note how it still says the range is -1 .. 1000. Can we redefine things to make zero be the disabled value, thus keeping the range of valid values contiguous? That would be another solution ... though it would be different from the valid value for autovacuum_vacuum_cost_delay (on which 0 is a valid value). Also it would be a different value from previous versions. I don't think either of these is a showstopper, so let's go for that if nobody objects. Can you make 0 and -1 both valid disabled values? That way it will be compatible with previous releases.
Re: [HACKERS] Autovacuum launcher doesn't notice death of postmaster immediately
Tom Lane wrote: Andrew Hammond [EMAIL PROTECTED] writes: Hmmm... it seems to me that points new users towards not using autovacuum, which doesn't seem like the best idea. I think it'd be better to say that setting the naptime really high is a Bad Idea. It seems like we should have an upper limit on the GUC variable that's less than INT_MAX ;-). Would an hour be sane? 10 minutes? This is independent of the problem at hand, though, which is that we probably want the launcher to notice postmaster death in less time than autovacuum_naptime, for reasonable values of same. Do we need a configurable autovacuum naptime at all? I know I put it in the original contrib autovacuum because I had no idea what knobs might be needed. I can't see a good reason to ever have a naptime longer than the default 60 seconds, but I suppose one might want a smaller naptime for a very active system?
Re: [HACKERS] [RFC] GSoC Work on readonly queries done so far
Florian G. Pflug wrote: Work done so far: - .) Don't start autovacuum and bgwriter. Do table stats used by the planner get replicated on a PITR slave? I assume so, but if not, you would need autovac to do analyzes.
Re: [HACKERS] [RFC] GSoC Work on readonly queries done so far
Alvaro Herrera wrote: Simon Riggs wrote: On Wed, 2007-06-06 at 12:17 -0400, Matthew T. O'Connor wrote: Florian G. Pflug wrote: Work done so far: - .) Don't start autovacuum and bgwriter. Do table stats used by the planner get replicated on a PITR slave? I assume so, but if not, you would need autovac to do analyzes. The replication is an exact block-level replication of the master. We can't write very much at all on the slave. Hmm, something to keep in mind is forcing cache invals when the master causes them (for example relation cache, catalog caches and plan caches). Perhaps if you are a PITR master and you have active read-only slaves, there should be a WAL record to note plan invalidations, etc.?
Re: [HACKERS] Autovacuum versus rolled-back transactions
Tom Lane wrote: ITAGAKI Takahiro [EMAIL PROTECTED] writes: Our documentation says | analyze threshold = analyze base threshold | + analyze scale factor * number of tuples | is compared to the total number of tuples inserted, updated, or deleted | since the last ANALYZE. but deleted tuples are not considered in the total number, because the delta of {n_live_tuples + n_dead_tuples} is not changed by DELETE. We add the number of DELETE into n_live_tuples and subtract it from n_dead_tuples. Yeah, I was concerned about that when I was making the patch, but didn't see any simple fix. A large number of DELETEs (without any inserts or updates) would trigger a VACUUM but not an ANALYZE, which in the worst case would be bad because the stats could have shifted. We could fix this at the cost of carrying another per-table counter in the stats info, but I'm not sure it's worth it. I believe that whenever autovacuum performs a VACUUM it actually performs a VACUUM ANALYZE; at least the old contrib version did, and I think Alvaro copied that.
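The accounting problem Takahiro-san describes can be shown in miniature (an illustrative model, not the backend code): if "tuples changed since last ANALYZE" is inferred from the change in live + dead counts, DELETEs are invisible, because a DELETE moves a tuple from live to dead and leaves the sum unchanged.

```python
# Toy model of the counter delta being discussed. A DELETE decrements
# live and increments dead, so the sum live + dead never moves.
def changed_since_analyze(live_then, dead_then, live_now, dead_now):
    return (live_now + dead_now) - (live_then + dead_then)

# 1000 live tuples at the last ANALYZE, then 400 rows deleted
# (no inserts or updates): the activity is invisible to this metric.
print(changed_since_analyze(1000, 0, 600, 400))  # 0

# 400 inserts, by contrast, show up in full.
print(changed_since_analyze(1000, 0, 1400, 0))   # 400
```

This is why a delete-heavy workload can trip the vacuum threshold while never tripping the analyze threshold, which is the gap the proposed extra per-table counter would close.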
Re: [HACKERS] What X86/X64 OS's do we need coverage for?
Larry Rosenman wrote: I might use that as the base then, since the hardware finishes getting here tomorrow. The other thing to consider is that CentOS 5 has Xen built right in, so you should be able to run VMs without VMware on it.
Re: [HACKERS] Documentation access problems.
Joshua D. Drake wrote: The big thing for me, is a single document, zero clicks, that is searchable. PDF and plain text are the only thing that give me that. If you are really zealous you can even use Beagle (which I don't) to preindex the PDF for you for easy searching. Lots of projects publish their HTML docs in two formats: one big HTML file with everything, and a version broken up into many HTML files that link to each other. This would allow you to have one big searchable document.
Re: [HACKERS] Documentation access problems.
Bruce Momjian wrote: Tom Lane wrote: Matthew T. O'Connor matthew@zeut.net writes: Lots of projects publish their HTML docs in two formats: One Big HTML file with everything; Broken up into many HTML files that link to each other. This would allow you to have one big searchable document. The key word there being big ;-) ... I don't have any problem with making such a version available on the website, but I don't think shipping two versions of the HTML docs in our tarballs is reasonable. I think having the single HTML file version available on our web site is enough. Agreed.
Re: [HACKERS] autovacuum next steps, take 3
My initial reaction is that this looks good to me, but still a few comments below. Alvaro Herrera wrote: Here is a low-level, very detailed description of the implementation of the autovacuum ideas we have so far. launcher's dealing with databases - [ Snip ] launcher and worker interactions [Snip] worker to-do list - When each worker starts, it determines which tables to process in the usual fashion: get pg_autovacuum and pgstat data and compute the equations. The worker then takes a snapshot of what's currently going on in the database, by storing worker PIDs, the corresponding table OID that's being currently worked, and the to-do list for each worker. Does a new worker really care about the PID of other workers or what table they are currently working on? It removes from its to-do list the tables being processed. Finally, it writes the list to disk. Just to be clear: the new worker removes from its to-do list all the tables mentioned in the to-do lists of all the other workers? The table list will be written to a file in PGDATA/vacuum/database-oid/todo.worker-pid The file will consist of table OIDs, in the order in which they are going to be vacuumed. At this point, vacuuming can begin. This all sounds good to me so far. Before processing each table, it scans the WorkerInfos to see if there's a new worker, in which case it reads its to-do list to memory. It's not clear to me why a worker cares that there is a new worker, since the new worker is going to ignore all the tables already claimed in other workers' to-do lists. Then it again fetches the tables being processed by other workers in the same database, and for each other worker, removes from its own in-memory to-do all those tables mentioned in the other lists that appear earlier than the current table being processed (inclusive). Then it picks the next non-removed table in the list.
All of this must be done with the Autovacuum LWLock grabbed in exclusive mode, so that no other worker can pick the same table (no IO takes place here, because the whole lists were saved in memory at the start.) Again, it's not clear to me what this is gaining us. It seems to me that when a worker starts up and writes out its to-do list, it should just work through it; I don't see the value in workers constantly updating their to-do lists. Maybe I'm just missing something; can you enlighten me? other things to consider This proposal doesn't deal with the hot tables stuff at all, but that is very easy to bolt on later: just change the first phase, where the initial to-do list is determined, to exclude cold tables. That way, the vacuuming will be fast. Determining what is a cold table is still an exercise to the reader ... I think we can make this algorithm naturally favor small / hot tables with one small change: have workers remove tables they have just vacuumed from their to-do lists and rewrite those lists to disk. Assuming the to-do lists are ordered by size ascending, smaller tables will be made available for inspection by newer workers sooner rather than later.
Re: [HACKERS] autovacuum next steps, take 3
Tom Lane wrote: Matthew T. O'Connor matthew@zeut.net writes: It's not clear to me why a worker cares that there is a new worker, since the new worker is going to ignore all the tables that are already claimed by all worker to-do lists. That seems wrong to me, since it means that new workers will ignore tables that are scheduled for processing by an existing worker, no matter how far in the future that schedule extends. As an example, suppose you have half a dozen large tables in need of vacuuming. The first worker in will queue them all up, and subsequent workers will do nothing useful, at least not till the first worker is done with the first table. Having the first worker update its to-do list file after each table allows the earlier tables to be exposed for reconsideration, but that's expensive and it does nothing for later tables.

Well, the big problem that we have is not that large tables are being starved, so this doesn't bother me too much; plus there is only so much I/O, so one worker working sequentially through the big tables seems OK to me.

I suggest that maybe we don't need exposed to-do lists at all. Rather, the workers could have internal to-do lists that are priority-sorted in some way, and expose only their current table OID in shared memory. Then the algorithm for processing each table in your list is:

1. Grab the AutovacSchedule LWLock exclusively.
2. Check to see if another worker is currently processing that table; if so, drop the LWLock and go to the next list entry.
3. Recompute whether the table needs vacuuming; if not, drop the LWLock and go to the next entry. (This test covers the case where someone vacuumed the table since you made your list.)
4. Put the table OID into shared memory, drop the LWLock, then vacuum the table.
5. Clear the current-table OID from shared memory, then repeat for the next list entry.

This creates a "whoever gets to it first" behavior rather than allowing workers to claim tables that they actually won't be able to service any time soon. 
Right, but you could wind up with as many workers working concurrently as you have tables in a database, which doesn't seem like a good idea either. One thing I like about the to-do list setup Alvaro had is that new workers will be assigned fewer tables to work on and hence exit sooner. We are going to fire off a new worker every autovac_naptime, so availability of new workers isn't going to be a problem. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
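[Editorial note: the per-table loop Tom describes above can be sketched as follows. This is illustrative Python, not PostgreSQL source: a threading lock and a dict stand in for the AutovacSchedule LWLock and the shared-memory current-table slots, and all function names are hypothetical.]

```python
import threading

autovac_schedule_lock = threading.Lock()  # stands in for the AutovacSchedule LWLock
current_table = {}  # worker id -> OID of the table that worker is vacuuming now

def process_todo_list(worker_id, todo_oids, still_needs_vacuum, vacuum):
    """Tom's claim-then-vacuum loop: claim a table under the lock,
    do the actual vacuuming with the lock released."""
    for oid in todo_oids:
        with autovac_schedule_lock:                 # step 1: grab lock exclusively
            if oid in current_table.values():       # step 2: another worker has it
                continue
            if not still_needs_vacuum(oid):         # step 3: re-check the thresholds
                continue
            current_table[worker_id] = oid          # step 4: advertise the claim
        try:
            vacuum(oid)                             # vacuum outside the lock
        finally:
            with autovac_schedule_lock:
                current_table.pop(worker_id, None)  # step 5: clear the claim
```

Note how the re-check in step 3 handles the case where another worker vacuumed the table after this worker built its list, without any worker ever exposing its full to-do list.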
Re: [HACKERS] autovacuum next steps, take 2
Jim C. Nasby wrote: On Tue, Feb 27, 2007 at 12:00:41AM -0300, Alvaro Herrera wrote: Jim C. Nasby wrote: The advantage to keying this to autovac_naptime is that it means we don't need another GUC, but after I suggested that before I realized that's probably not the best idea. For example, I've seen clusters that are running dozens-hundreds of databases; in that environment you really need to turn naptime way down (to like a second). In that case you wouldn't want to key to naptime. Actually, I've been thinking that it would be a good idea to change the semantics of autovacuum_naptime so that it means the average time to start a worker in any given database. That way, the time between autovac runs is not dependent on the number of databases you have. BTW, another issue that I don't think we can ignore: we actually need to do this on a per-tablespace level, or at least have the ability to disable or somehow limit it. While it's not common, there are users that run a hundred or more databases in a single cluster; it would be ugly if we suddenly had 100 vacuums trying to run on the same set of drives concurrently. I think we all agree that autovacuum needs to become tablespace aware at some point, but I think that is further down the line, we're having enough trouble figuring things out without that additional complication. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
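[Editorial note: Alvaro's proposed change to the naptime semantics, where naptime is the average time between worker starts in any one database, implies the launcher must shorten its own launch interval as databases are added. A minimal sketch, with an assumed helper name:]

```python
def launch_interval(naptime_secs, n_databases):
    """Under the proposed semantics, each database sees a worker every
    naptime_secs on average, so the launcher itself must start a worker
    every naptime_secs / n_databases (sketch, not actual launcher code)."""
    return naptime_secs / max(n_databases, 1)
```

With the default 60s naptime, a single-database cluster launches once a minute, while a 100-database cluster launches a worker every 0.6 seconds, which is why the time between autovac runs in a given database stops depending on how many databases exist.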
Re: [HACKERS] Resumable vacuum proposal and design overview
Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Tue, 2007-02-27 at 10:37 -0600, Jim C. Nasby wrote: ... The idea would be to give vacuum a target run time, and it would monitor how much time it had remaining, taking into account how long it should take to scan the indexes based on how long it's been taking to scan the heap. When the amount of time left becomes less than the estimate of the amount of time required to scan the indexes (and clean the heap), you stop the heap scan and start scanning indexes. I do like this idea, but it also seems easy to calculate that bit yourself. Run VACUUM, after X minutes issue stop_vacuum() and see how long it takes to finish. Adjust X until you have it right. One problem with it is that a too-small target would result in vacuum proceeding to scan indexes after having accumulated only a few dead tuples, resulting in increases (potentially enormous ones) in the total work needed to vacuum the table completely. I think it's sufficient to have two cases: abort now, and restart from the last cycle-completion point next time (this would basically just be SIGINT); or set a flag to stop at the next cycle-completion point. It occurs to me that we may be thinking about this the wrong way entirely. Perhaps a more useful answer to the problem of using a defined maintenance window is to allow VACUUM to respond to changes in the vacuum cost delay settings on-the-fly. So when your window closes, you don't abandon your work so far, you just throttle your I/O rate back to whatever's considered acceptable for daytime vacuuming. I thought we already did that? That, BTW, was part of my plan for how to deal with a vacuum that is still running after its maintenance window has expired. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] autovacuum next steps, take 2
Alvaro Herrera wrote: Jim C. Nasby wrote: That's why I'm thinking it would be best to keep the maximum size of stuff for the second worker small. It probably also makes sense to tie it to time and not size, since the key factor is that you want it to hit the high-update tables every X number of seconds. If we wanted to get fancy, we could factor in how far over the vacuum threshold a table is, so even if the table is on the larger side, if it's way over the threshold the second vacuum will hit it. Ok, I think we may be actually getting somewhere. Me too. I propose to have two different algorithms for choosing the tables to work on. The worker would behave differently, depending on whether there is one or more workers on the database already or not. The first algorithm is the plain threshold equation stuff we use today. If a worker connects and determines that no other worker is in the database, it uses the plain worker mode. A worker in this mode would examine pgstats, determine what tables to vacuum/analyze, sort them by size (smaller to larger), and go about its work. This kind of worker can take a long time to vacuum the whole database -- we don't impose any time limit or table size limit on what it can do. Right, I like this. The second mode is the hot table worker mode, enabled when the worker detects that there's already a worker in the database. In this mode, the worker is limited to those tables that can be vacuumed in less than autovacuum_naptime, so large tables are not considered. Because of this, it'll generally not compete with the first mode above -- the tables in plain worker mode were sorted by size, so the small tables were among the first vacuumed by the plain worker. The estimated time to vacuum may be calculated according to autovacuum_vacuum_delay settings, assuming that all pages constitute cache misses. How can you determine what tables can be vacuumed within autovacuum_naptime? 
I agree that large tables should be excluded, but I don't know how we can do that calculation based on autovacuum_naptime. So at:

t=0*autovacuum_naptime: worker1 gets started on DBX
t=1*autovacuum_naptime: worker2 gets started on DBX; worker2 determines all tables that need to be vacuumed, excludes tables that are too big from its to-do list, and gets to work. worker2 exits when it either: a) finishes its entire to-do list, or b) catches up to worker1.

I think the questions are: 1) What is the exact math you are planning on using to determine which tables are too big? 2) Do we want worker2 to exit when it catches worker1, or does the fact that we have excluded tables that are too big mean that we don't have to worry about this? ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] autovacuum next steps, take 2
Alvaro Herrera wrote: Matthew T. O'Connor wrote: How can you determine what tables can be vacuumed within autovacuum_naptime? My assumption is that pg_class.relpages * vacuum_cost_page_miss * vacuum_cost_delay = time to vacuum This is of course not the reality, because the delay is not how long it takes to fetch the pages. But it lets us have a value with which we can do something. With the default values, vacuum_cost_delay=10, vacuum_cost_page_miss=10, autovacuum_naptime=60s, we'll consider tables of under 600 pages, 4800 kB (should we include indexes here in the relpages count? My guess is no). I'm not sure how pg_class.relpages is maintained, but what happens to a bloated table? For example, a 100 row table that is constantly updated and hasn't been vacuumed in a while (say the admin disabled autovacuum for a while): now that small 100 row table has 1000 pages in it, most of which are just bloat. Will we miss this table? Perhaps basing this on reltuples would be better? A table over 600 pages does not sound like a good candidate for hot, so this seems more or less reasonable to me. On the other hand, maybe we shouldn't tie this to the vacuum cost delay stuff. I'm not sure it's a good idea to tie this to the vacuum cost delay settings either, so let me ask you this: how is this better than just allowing the admin to set a new GUC variable like autovacuum_hot_table_size_threshold (or something shorter) which we can assign a decent default of say 8MB. Thoughts? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
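[Editorial note: Alvaro's deliberately crude estimate above can be written out as follows; this is a sketch of the formula as stated in the email (it charges every page as a cache miss and ignores vacuum_cost_limit batching, as he acknowledges), with hypothetical function names.]

```python
def est_vacuum_ms(relpages, cost_page_miss=10, cost_delay_ms=10):
    """Rough time to vacuum: every page is a cache miss, and the cost
    delay is treated as the whole per-page cost."""
    return relpages * cost_page_miss * cost_delay_ms

def hot_table_page_limit(naptime_ms=60_000, cost_page_miss=10, cost_delay_ms=10):
    """Largest table the hot-table worker would consider: the relpages
    count whose estimated vacuum time just fits inside one naptime."""
    return naptime_ms // (cost_page_miss * cost_delay_ms)
```

With the defaults this reproduces the 600-page (4800 kB at 8 kB/page) cutoff from the email.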
Re: [HACKERS] autovacuum next steps, take 2
Alvaro Herrera wrote: Matthew T. O'Connor wrote: I'm not sure how pg_class.relpages is maintained, but what happens to a bloated table? For example, a 100 row table that is constantly updated and hasn't been vacuumed in a while (say the admin disabled autovacuum for a while): now that small 100 row table has 1000 pages in it, most of which are just bloat. Will we miss this table? Perhaps basing this on reltuples would be better? Well, this would only happen the first time, until the plain worker processed the table; next time it would be picked up by the hot table worker. But yeah, we can build a better estimate using the same trick the planner uses: estimate tuple density as reltuples/relpages times the actual number of blocks on disk. Well, even skipping it the first time isn't good; anything that further delays a hot table from getting vacuumed is bad. Also, I'm not sure it would just be the first time, since plain VACUUM isn't going to reclaim most of the space, just mark it as reusable. This is moot however if we use a good metric; I have no idea if what you suggest above would be good enough. A table over 600 pages does not sound like a good candidate for hot, so this seems more or less reasonable to me. On the other hand, maybe we shouldn't tie this to the vacuum cost delay stuff. I'm not sure it's a good idea to tie this to the vacuum cost delay settings either, so let me ask you this: how is this better than just allowing the admin to set a new GUC variable like autovacuum_hot_table_size_threshold (or something shorter) which we can assign a decent default of say 8MB. Yeah, maybe that's better -- it's certainly simpler. Simple is better, at least until proven otherwise. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
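[Editorial note: the planner trick Alvaro mentions, scaling the last-known tuple density by the table's current size on disk, could look like this sketch. The function name is hypothetical; it only illustrates the estimate being discussed.]

```python
def est_live_tuples(reltuples, relpages, current_blocks):
    """Scale last-known tuple density (reltuples/relpages) by the
    table's current block count, so a table that has grown or bloated
    since the last VACUUM/ANALYZE isn't judged by stale relpages alone."""
    if relpages == 0:
        return 0.0
    return (reltuples / relpages) * current_blocks
```

For the bloated example in the thread, once stats record a low density (say 100 tuples spread over 1000 pages), a later size check sees the low density rather than treating all 1000 pages as live data.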
Re: [HACKERS] autovacuum next steps, take 2
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Matthew T. O'Connor wrote: I'm not sure it's a good idea to tie this to the vacuum cost delay settings either, so let me ask you this: how is this better than just allowing the admin to set a new GUC variable like autovacuum_hot_table_size_threshold (or something shorter) which we can assign a decent default of say 8MB. Yeah, maybe that's better -- it's certainly simpler. I'm not liking any of these very much, as they seem critically dependent on impossible-to-tune parameters. I think it'd be better to design this around having the first worker explicitly expose its state (list of tables to process, in order) and having subsequent workers key off that info. The shared memory state could include the OID of the table each worker is currently working on, and we could keep the to-do list in some simple flat file for instance (since we don't care about crash safety). So far we are only talking about one parameter, the hot_table_size_threshold, which I agree would be a guess by an admin, but if we went in this direction, I would also advocate adding a column to the pg_autovacuum table that allows an admin to explicitly define a table as hot or not. Also, I think each worker should be mostly independent, the only caveat being that (assuming each worker works in size order) if we catch up to an older worker (get to the table they are currently working on) we exit. Personally I think this is all we need, but others felt the additional threshold was needed. What do you think? Or what do you think might be better? I'm not certain exactly what "key off" needs to mean; perhaps each worker should make its own to-do list and then discard items that are either in-progress or recently done by another worker when it gets to them. My initial design didn't have any threshold at all, but others felt this would/could result in too many workers working concurrently in the same DB. 
I think an absolute minimum requirement for a sane design is that no two workers ever try to vacuum the same table concurrently, and I don't see where that behavior will emerge from your proposal; whereas it's fairly easy to make it happen if non-first workers pay attention to what other workers are doing. Maybe we never made that clear; I was always working on the assumption that two workers would never try to work on the same table at the same time. BTW, it's probably necessary to treat shared catalogs specially ... Certainly. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] autovacuum next steps, take 2
Jim C. Nasby wrote: On Mon, Feb 26, 2007 at 06:23:22PM -0500, Matthew T. O'Connor wrote: I'm not sure how pg_class.relpages is maintained, but what happens to a bloated table? For example, a 100 row table that is constantly updated and hasn't been vacuumed in a while (say the admin disabled autovacuum for a while): now that small 100 row table has 1000 pages in it, most of which are just bloat. Will we miss this table? Perhaps basing this on reltuples would be better? The entire point of this is to ensure that the second daemon will only vacuum tables that it can finish very quickly. If you let a table bloat so it's too big, then you just can't vacuum it very frequently without risking all your other hot tables bloating because they're no longer getting vacuumed. The reality is that you can actually vacuum a pretty good-sized table in 60 seconds with typical cost-delay settings (ie: defaults except cost_delay set to 10). That means you can do 9 pages ~100 times a second, or 54k pages a minute. Even with a vacuum_cost_delay of 20, that's still 27k pages per minute. At the risk of sounding like a broken record, I still think the size limit threshold is unnecessary. Since all workers will be working on tables in size order, younger workers will typically catch older workers fairly quickly, since the tables will be either small, or recently vacuumed and not in need of work. And since younger workers exit when they catch up to an older worker, there is some inherent stability in the number of workers. Here is a worst case example: a DB with 6 tables, all of which are highly active and will need to be vacuumed constantly. While this is totally hypothetical, it is how I envision things working (without the threshold). 
table1: 10 rows
table2: 100 rows
table3: 1,000 rows
table4: 10,000 rows
table5: 100,000 rows
table6: 1,000,000 rows

time=0*naptime: No workers in the DB.
time=1*naptime: worker1 starts on table1.
time=2*naptime: worker1 has finished table1, table2 and table3 and is now working on table4; worker2 starts on table1.
time=3*naptime: worker1 is on table5; worker2 is working on table4; worker3 starts on table1.
time=4*naptime: worker1 is still on table5; worker2 has caught up to worker1 and exits; worker3 also catches up to worker1 (since tables 2-4 didn't require vacuum at this time) and exits; worker4 starts on table1.
time=5*naptime: worker1 is working on table6; worker4 is up to table4; worker5 starts on table1.
time=6*naptime: worker1 is working on table6; worker4 catches up to worker1 and exits; worker5 finds no additional work to be done and exits; worker6 starts at table1.
time=7*naptime: worker1 still working on table6; worker6 is up to table4; worker7 starts at table1.
time=8*naptime: worker1 still working on table6; worker6 still working on table4; worker7 working on table3; worker8 starting on table1.
time=9*naptime: worker1 still working on table6; worker6 working on table5; worker7 catches worker6 and exits; worker8 finds nothing more to do and exits; worker9 starts on table1.
time=10*naptime: worker1 still working on table6; worker9 working on table4; worker10 starts on table1.
---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] autovacuum next steps, take 2
Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: The real problem is trying to set that up in such a fashion that keeps hot tables frequently vacuumed; Are we assuming that no single worker instance will vacuum a given table more than once? (That's not a necessary assumption, certainly, but without it there are so many degrees of freedom that I'm not sure how it should act.) Given that assumption, the maximum vacuuming rate for any table is once per autovacuum_naptime, and most of the magic lies in the launcher's algorithm for deciding which databases to launch workers into. Yes, I have been working under the assumption that a worker goes through the list of tables once and exits, and yes, the maximum vacuuming rate for any table would be once per autovacuum_naptime. We can lower the default if necessary; as far as I'm concerned it's (or should be) fairly cheap to fire off a worker and have it find that there isn't anything to do and exit. I'm inclined to propose an even simpler algorithm in which every worker acts alike; its behavior is 1. On startup, generate a to-do list of tables to process, sorted in priority order. 2. For each table in the list, if the table is still around and has not been vacuumed by someone else since you started (including the case of a vacuum-in-progress), then vacuum it. That is what I'm proposing except for one difference: when you catch up to an older worker, exit. This has the benefit of reducing the number of workers concurrently working on big tables, which I think is a good thing. Detecting "already vacuumed since you started" is a bit tricky; you can't really rely on the stats collector since its info isn't very up-to-date. That's why I was thinking of exposing the to-do lists explicitly; comparing those with an advertised current-table would allow accurate determination of what had just gotten done. 
Sounds good, but I have very little insight into how we would implement "already vacuumed since you started" or "have I caught up to another worker". ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] autovacuum next steps, take 2
Tom Lane wrote: BTW, to what extent might this whole problem be simplified if we adopt chunk-at-a-time vacuuming (compare current discussion with Galy Lee)? If the unit of work has a reasonable upper bound regardless of table size, maybe the problem of big tables starving small ones goes away. So if we adopted chunk-at-a-time, then perhaps each worker processes the list of tables in OID order (or some unique and stable order) and does one chunk per table that needs vacuuming. This way an equal amount of bandwidth is given to all tables. That does sound simpler. Is chunk-at-a-time a realistic option for 8.3? Matt ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] autovacuum next steps, take 2
Tom Lane wrote: Matthew T. O'Connor matthew@zeut.net writes: Tom Lane wrote: I'm inclined to propose an even simpler algorithm in which every worker acts alike; That is what I'm proposing except for one difference: when you catch up to an older worker, exit. No, that's a bad idea, because it means that any large table starves even-larger tables. True, but the assumption I'm making is that there is a finite amount of bandwidth available, and more concurrent activity will have a net negative effect on the time it takes to vacuum all tables. I'm willing to pay that price to prevent small hot tables from getting starved, but less willing to pay the same price for large tables, where the percentage of bloat will be much smaller. (Note: in all this I assume we're all using size as a shorthand for some sort of priority metric that considers number of dirty tuples, not only size. We don't want every worker insisting on passing over every small read-only table every time, for instance.) I was using size to mean reltuples. The whole concept of sorting by size was designed to ensure that smaller (more susceptible to bloat) tables got priority. It might be useful for workers to sort their to-do lists by some other metric, but I don't have a clear vision of what that might be. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] autovacuum next steps, take 2
Tom Lane wrote: Matthew T. O'Connor matthew@zeut.net writes: That does sound simpler. Is chunk-at-a-time a realistic option for 8.3? It seems fairly trivial to me to have a scheme where you do one fill-workmem-and-scan-indexes cycle per invocation, and store the next-heap-page-to-scan in some handy place (new pg_class column updated along with relpages/reltuples, likely). Galy is off in left field with some far more complex ideas :-( but I don't see that there's all that much needed to support this behavior ... especially if we don't expose it to the SQL level but only support it for autovac's use. Then we're not making any big commitment to support the behavior forever. Well, if we can make it happen soon, it might be the best thing for autovacuum. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] autovacuum next steps, take 2
Jim C. Nasby wrote: On Mon, Feb 26, 2007 at 10:18:36PM -0500, Matthew T. O'Connor wrote: Jim C. Nasby wrote: Here is a worst case example: a DB with 6 tables, all of which are highly active and will need to be vacuumed constantly. While this is totally hypothetical, it is how I envision things working (without the threshold). I fail to see how a simple 6 table case is 'worst case'. It's common to see hundreds of tables, and I've run across more than one database with thousands of tables (think partitioning). In cases like those it's certainly possible, perhaps even likely, that you would get many daemons running in the database at one time just from different tables suddenly needing vacuuming and appearing at a higher point in the list than other tables. With 100 ~1G tables getting updates it certainly wouldn't be hard to end up with 10 of those being vacuumed all at the same time. Yes, 6 tables is small; the worst-case part of the example was that all the tables would need to be vacuumed constantly. Most databases only have a few hot tables. Most tables only need to be vacuumed every once in a while. I do like the idea since it should be easier to tune, but I think we still need some limit on it. Perhaps as a first-pass we could just have a hard limit and log a message and/or set a flag any time we hit it. That would hopefully allow us to get information about how big a problem it really is. We could go one step further and say that the last daemon that can start in a database will only vacuum tables that can be done quickly; that's essentially what we've been talking about, except the limit we've been discussing would be hard-coded at 2. I'm confused, what limit would be set at 2? The number of concurrent workers? I've never said that. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] autovacuum next steps, take 2
Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: The proposal to save enough state to be able to resume a vacuum at pretty much any point in its cycle might work; we'd have to benchmark it. With the default maintenance_work_mem of 128M it would mean writing out 64M of state every minute on average, which is likely to take several seconds to fsync (though, maybe we wouldn't need to fsync it...) Which is exactly why we needn't bother benchmarking it. Even if it weren't complex and unsafe, it will be a net loss when you consider the fact that it adds I/O instead of removing it. I'm not sure what you are saying here: are you now saying that partial vacuum won't work for autovac? Or are you saying that saving state as Jim is describing above won't work? ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] autovacuum next steps, take 2
Tom Lane wrote: Matthew T. O'Connor matthew@zeut.net writes: I'm not sure what you are saying here: are you now saying that partial vacuum won't work for autovac? Or are you saying that saving state as Jim is describing above won't work? I'm saying that I don't like the idea of trying to stop on a dime by saving the current contents of vacuum's dead-TID array to disk with the idea that we can trust those values 100% later. Saving the array is expensive both in runtime and code complexity, and I don't believe we can trust it later --- at least not without even more expensive-and-complex measures, such as WAL-logging every such save :-( I'm for stopping only after completing an index-cleaning pass, at the point where we empty the dead-TID array anyway. If you really have to have stop on a dime, just kill -INT the process, accepting that you will have to redo your heap scan since the last restart point. OK, so if I understand correctly, a vacuum of a table with 10 indexes on it can be interrupted 10 times, once after each index-cleaning pass? That might have some value, especially breaking up the work required to vacuum a large table. Or am I still not getting it? ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
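[Editorial note: under Tom's scheme the stop points are the moments the dead-TID array fills and gets emptied by an index-cleaning pass (each pass scans all of the table's indexes), so the number of interruption points depends on dead tuples versus maintenance_work_mem, not on the index count. A sketch of that counting, with hypothetical names:]

```python
from math import ceil

def cycle_completion_points(dead_tuples, workmem_tid_capacity):
    """Number of index-cleaning passes, i.e. safe stop points: one each
    time the dead-TID array (sized by maintenance_work_mem) fills, plus
    the final pass. Independent of how many indexes the table has."""
    return max(1, ceil(dead_tuples / workmem_tid_capacity))
```

So a table whose dead tuples fit in maintenance_work_mem offers exactly one stop point regardless of whether it has one index or ten.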
Re: [HACKERS] autovacuum next steps, take 2
Jim C. Nasby wrote: On Wed, Feb 21, 2007 at 05:40:53PM -0500, Matthew T. O'Connor wrote: My Proposal: If we require admins to identify hot tables, then: 1) Launcher fires off a worker1 into database X. 2) worker1 deals with hot tables first, then regular tables. 3) Launcher continues to launch workers to DB X every autovac naptime. 4) worker2 (or 3 or 4 etc...) sees it is alone in DB X; if so it acts as worker1 did above. If worker1 is still working in DB X, then worker2 looks for hot tables that are being starved because worker1 got busy. If worker2 finds no hot tables that need work, then worker2 exits. Rather than requiring people to manually identify hot tables, what if we just prioritize based on table size? So if a second autovac process hits a specific database, it would find the smallest table in need of vacuuming that it should be able to complete before the next naptime and vacuum that. It could even continue picking tables until it can't find one that it could finish within the naptime. Granted, it would have to make some assumptions about how many pages it would dirty. ISTM that's a lot easier than forcing admins to mark specific tables. So the heuristic would be:

* Launcher fires off workers into a database at a given interval (perhaps configurable?)
* Each worker works on tables in size order.
* If a worker ever catches up to an older worker, then the younger worker exits.

This sounds simple and workable to me; perhaps we can later modify this to include some max_workers variable, so that a worker would only exit if it catches an older worker and there are max_workers currently active. Thoughts? ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
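[Editorial note: the three-bullet heuristic above can be sketched as a toy model; this is illustrative Python with hypothetical names, not a real autovacuum worker.]

```python
def run_worker(tables_by_size, active_workers, worker_id):
    """One worker pass under the catch-up heuristic: walk tables
    smallest-first, skip nothing, and exit the moment we reach a table
    an older worker is currently vacuuming.
    `active_workers` maps worker id -> table currently being vacuumed."""
    vacuumed = []
    for table in tables_by_size:          # assumed sorted smallest-first
        if table in active_workers.values():
            break                         # caught an older worker: exit early
        vacuumed.append(table)            # otherwise vacuum it (modeled as append)
    return vacuumed
```

This shows why younger workers tend to exit quickly: the tables ahead of an older worker's position are small or freshly vacuumed, so the younger worker reaches the older one's current table after little work.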
Re: [HACKERS] autovacuum next steps, take 2
Jim C. Nasby wrote: On Thu, Feb 22, 2007 at 09:32:57AM -0500, Matthew T. O'Connor wrote: So the heuristic would be: * Launcher fires off workers into a database at a given interval (perhaps configurable?) * Each worker works on tables in size order. * If a worker ever catches up to an older worker, then the younger worker exits. This sounds simple and workable to me; perhaps we can later modify this to include some max_workers variable, so that a worker would only exit if it catches an older worker and there are max_workers currently active. That would likely result in a number of workers running in one database, unless you limited how many workers per database. And if you did that, you wouldn't be addressing the frequently updated table problem. A second vacuum in a database *must* exit after a fairly short time so that we can go back in and vacuum the important tables again (well, or the 2nd vacuum has to periodically re-evaluate what tables need to be vacuumed). I'm not sure this is a great idea, but I don't see how this would result in large numbers of workers working in one database. If workers work on tables in size order, and exit as soon as they catch up to an older worker, I don't see the problem. Newer workers are going to catch up to older workers pretty quickly, since small tables will vacuum fairly quickly. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] autovacuum next steps, take 2
Alvaro Herrera wrote: Ok, scratch that :-) Another round of braindumping below. I still think this is a solution in search of a problem. The main problem we have right now is that hot tables can be starved of vacuum. Most of this proposal doesn't touch that. I would like to see that problem solved first; then we can talk about adding multiple workers per database or per tablespace etc... (This idea can be complemented by having another GUC var, autovacuum_hot_workers, which allows the DBA to have more than one worker on hot tables (just for the case where there are too many hot tables). This may be overkill.) I think this is more along the lines of what we need first. Ron Mayer expressed the thought that we're complicating needlessly the UI for vacuum_delay, naptime, etc. He proposes that instead of having cost_delay etc, we have an mbytes_per_second parameter of some sort. This strikes me as a good idea, but I think we could make that change after this proposal is implemented. So this take 2 could be implemented, and then we could switch the cost_delay stuff to using a MB/s kind of measurement somehow (he says, waving his hands wildly). Agree this is probably a good idea in the long run, but I agree this is lower on the priority list and should come next. Greg Stark and Matthew O'Connor say that we're misdirected in having more than one worker per tablespace. I say we're not :-) If we consider Ron Mayer's idea of measuring MB/s, but we do it per tablespace, then we would inflict the correct amount of vacuum pain to each tablespace, sleeping as appropriate. I think this would require workers of different databases to communicate what tablespaces they are using, so that all of them can utilize the correct amount of bandwidth. I agree that in the long run it might be better to have multiple workers with a MB/s throttle and tablespace awareness, but we don't have any of that infrastructure right now. 
I think the piece of low-hanging fruit that your launcher concept can solve is hot table starvation. My proposal: If we require admins to identify hot tables, then: 1) Launcher fires off a worker1 into database X. 2) worker1 deals with hot tables first, then regular tables. 3) Launcher continues to launch workers to DB X every autovac naptime. 4) worker2 (or 3 or 4 etc...) sees whether it is alone in DB X; if so it acts as worker1 did above. If worker1 is still working in DB X then worker2 looks for hot tables that are being starved because worker1 got busy. If worker2 finds no hot tables that need work, then worker2 exits. This seems a very simple solution (given your launcher work) that can solve the starvation problem. Thoughts?
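The four steps above can be sketched as a single decision function (a minimal sketch with hypothetical names; the 'hot' flags are assumed to come from the admin, as the proposal requires):

```python
# Sketch of the proposed worker logic: when alone in a database, do hot
# tables first, then the rest; when another worker is already active,
# only pick up starved hot tables and exit if there are none.
# All names are hypothetical, not from an actual patch.

def plan_worker(tables, another_worker_active):
    """tables: dicts with 'name', 'hot', 'needs_vacuum' keys.
    Returns the ordered work list; [] means this worker just exits."""
    hot = [t["name"] for t in tables if t["hot"] and t["needs_vacuum"]]
    if another_worker_active:
        return hot                      # step 4: hot tables only, or exit
    cold = [t["name"] for t in tables
            if not t["hot"] and t["needs_vacuum"]]
    return hot + cold                   # step 2: hot first, then regular
```

The point of the sketch is that no coordination beyond "is another worker active in this DB" is needed for the starvation case.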
Re: [HACKERS] autovacuum next steps
Alvaro Herrera wrote: After staring at my previous notes for autovac scheduling, it has become clear that the basics of it are not really going to work as specified. So here is a more realistic plan: [Snip Detailed Description] How does this sound? On first blush, I'm not sure I like this as it doesn't directly attack the table starvation problem, and I think it could be a net loss of speed. VACUUM is I/O bound; as such, just sending multiple vacuum commands at a DB isn't going to make things faster, you are now going to have multiple processes reading from multiple tables at the same time. I think in general this is a bad thing (unless we someday account for I/O made available from multiple tablespaces). In general the only time it's a good idea to have multiple vacuums running at the same time is when a big table is starving a small hot table and causing bloat. I think we can extend the current autovacuum stats to add one more column that specifies is hot or something to that effect. Then when the AV launcher sends a worker to a DB, it will first look for tables marked as hot and work on them. While a worker is working on hot tables, the launcher need not send any additional workers to this database; if the launcher notices that a worker is working on regular tables, it can send another worker which will look for hot tables to work on. If that worker doesn't find any hot tables that need work, then it exits, leaving the original worker to continue plodding along. Thoughts?
Re: [HACKERS] autovacuum next steps
Alvaro Herrera wrote: Matthew T. O'Connor wrote: On first blush, I'm not sure I like this as it doesn't directly attack the table starvation problem, and I think it could be a net loss of speed. VACUUM is I/O bound; as such, just sending multiple vacuum commands at a DB isn't going to make things faster, you are now going to have multiple processes reading from multiple tables at the same time. I think in general this is a bad thing (unless we someday account for I/O made available from multiple tablespaces). Yeah, I understand that. However, I think that can be remedied by using a reasonable autovacuum_vacuum_cost_delay setting, so that each worker uses less than the total I/O available. The main point of the proposal is to allow multiple workers on a DB while also allowing multiple databases to be processed in parallel. So you are telling people to choose an autovacuum_delay so high that they need to run multiple autovacuums at once to keep up? I'm probably being too dramatic, but it seems inconsistent. I think we can extend the current autovacuum stats to add one more column that specifies is hot or something to that effect. Then when the AV launcher sends a worker to a DB, it will first look for tables marked as hot and work on them. While a worker is working on hot tables, the launcher need not send any additional workers to this database; if the launcher notices that a worker is working on regular tables, it can send another worker which will look for hot tables to work on. If that worker doesn't find any hot tables that need work, then it exits, leaving the original worker to continue plodding along. How would you define what's a hot table? I wasn't clear; I would have the admin specify it, and we can store it as an additional column in the pg_autovacuum_settings table. Or perhaps a table is hot if it is below some size threshold and autovacuum sees that it needs to be vacuumed every time it checks it, say 10 times in a row or something like that. 
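The fallback heuristic floated at the end ("small, and needs vacuum 10 checks in a row") is easy to state concretely; this is only a sketch, and both cutoffs are invented placeholders rather than proposed values:

```python
# Sketch of auto-detecting "hot" tables: a table is hot if it is small
# and has needed vacuuming on every recent autovacuum check, say 10
# checks in a row.  HOT_SIZE_PAGES and HOT_STREAK are hypothetical.

HOT_SIZE_PAGES = 1000   # "small table" cutoff, assumed
HOT_STREAK = 10         # consecutive checks needing vacuum

def update_hotness(table, needed_vacuum_now):
    """table: mutable dict with 'pages' and (optionally) 'streak'.
    Updates the streak counter and returns the new hot flag."""
    table["streak"] = table.get("streak", 0) + 1 if needed_vacuum_now else 0
    table["hot"] = (table["pages"] <= HOT_SIZE_PAGES
                    and table["streak"] >= HOT_STREAK)
    return table["hot"]
```

Note the streak resets to zero on any check that finds no work, so a table only stays hot while it is genuinely update-heavy.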
Re: [HACKERS] autovacuum process handling
Alvaro Herrera wrote: This is how I think autovacuum should change with an eye towards being able to run multiple vacuums simultaneously: [snip details] Does this raise some red flags? It seems straightforward enough to me; I'll submit a patch implementing this, so that scheduling will continue to be as it is today. Thus the scheduling discussions are being deferred until they can be actually useful and implementable. I can't really speak to the PostgreSQL signaling innards, but this sounds logical to me. I think having the worker processes be children of the postmaster and having them be single-minded (or single-tasked) also makes a lot of sense.
Re: [HACKERS] [GENERAL] Autovacuum Improvements
Alvaro Herrera wrote: I'd like to hear other people's opinions on Darcy Buskermolen's proposal to have a log table, on which we'd register what we ran, at what time, how long it lasted, how many tuples it cleaned, etc. I feel having it in the regular text log is useful but it's not good enough. Keep in mind that in the future we may want to peek at that collected information to be able to make better scheduling decisions (or at least inform the DBA that he sucks). I'm not familiar with his proposal, but I'm not sure what I think of logging vacuum (and perhaps analyze) commands to a table. We have never logged anything to tables inside PG, and I would be worried about this eating a lot of space in some situations. I think most people would just be happy if we could get autovacuum to log its actions at a much higher log level. I think that autovacuum vacuumed table x is important and shouldn't be all the way down at the debug level. The other (more involved) solution that was proposed for this problem was to create a separate set of logging control params for autovacuum, so that you can turn it up or down independent of the general server logging. Now, I'd like this to be a VACUUM thing, not autovacuum. That means that manually-run vacuums would be logged as well. +1
Re: [HACKERS] Autovacuum improvements
Alvaro Herrera wrote: Matthew T. O'Connor wrote: This still seems ambiguous to me; how would I handle a maintenance window of weekends from Friday at 8PM through Monday morning at 6AM? My guess from what you said is:

mon   dom   dow   starttime   endtime
null  null  6     20:00       null
null  null  1     null        06:00

So how do we know to vacuum on Saturday or Sunday? I think clearly defined intervals with explicit start and stop times are cleaner:

mon   dom   dow   start   end
null  null  5     20:00   23:59:59
null  null  6     00:00   23:59:59
null  null  7     00:00   23:59:59
null  null  1     00:00   06:00

(1 = monday, 5 = friday) So it takes 4 lines to handle one logical interval; I don't really like that. I know that your concept of interval groups will help mask this, but still. Now I'm starting to wonder what will happen between 23:59:59 of day X and 00:00:00 of day (X+1) ... Maybe what we should do is not specify an end time, but a duration as an interval:

month     int
dom       int
dow       int
start     time
duration  interval

That way you can specify the above as

mon   dom   dow   start   duration
null  null  5     20:00   (4 hours + 2 days + 6 hours)

Now, if a DST boundary happens to fall in that interval you'll be an hour short, or it'll last an hour too long :-) I certainly like this better than the first proposal, but I still don't see how it's better than a full set of columns for start and end times. Can you tell me why you are trying to avoid that design? Hmm... this seems like queue is nearly a synonym for group. Can't we just add a num_workers property to table groups? That seems to accomplish the same thing. And yes, a GUC variable to limit the total number of concurrent autovacuums is probably a good idea. queue = group of groups. But I'm not sure about this at all, which is why I took it away from the proposal. I think we can live without the groups of groups, at least for now.
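For what it's worth, the start-plus-duration representation can be checked with ordinary datetime arithmetic (the dates below are just an example weekend, not from the thread):

```python
# The start + duration representation sketched above: Friday 20:00
# plus an interval of 4 hours + 2 days + 6 hours (58 hours total)
# lands on Monday 06:00, covering the whole weekend window in one row.
from datetime import datetime, timedelta

start = datetime(2007, 2, 23, 20, 0)      # a Friday at 8 PM
duration = timedelta(hours=4) + timedelta(days=2) + timedelta(hours=6)
end = start + duration                    # Monday at 6 AM
```

Naive datetimes ignore DST, which is exactly the caveat raised above: with timezone-aware timestamps, a DST shift inside the window would move the wall-clock end by an hour.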
Re: [HACKERS] Autovacuum improvements
First, thanks for working on this. I hope to be helpful with the design discussion and possibly some coding if I can find the time. My initial reaction to this proposal is that it seems overly complex; however, I don't see a more elegant solution. I'm a bit concerned that most users won't figure out all the knobs. Alvaro Herrera wrote: I've been thinking about how to improve autovacuum so that we can convince more people that it can be enabled by default. I would like to see it enabled by default too; however, the reason it isn't already enabled by default is that it caused failures in the regression tests when we tried to turn it on during the 8.2 dev cycle, and it was too close to beta to fix everything. All this new machinery is great, but it doesn't address that problem. Here are my thoughts. There are two areas of improvement: 1. scheduling, and 2. process handling, i.e., how to have multiple vacuum processes running at any time. Fair enough, but I would say the two biggest areas for improvement are scheduling and preventing HOT tables from becoming vacuum starved (essentially what you said, but with a different emphasis). [snip] Process Handling My idea here is to morph the current autovacuum daemon from an agent that itself runs a vacuum command, into something that launches other processes to run those vacuum commands. I'll call this the autovacuum launcher process, or the launcher for short. The idea here is that the launcher can take care of the scheduling while the worker processes do their work. If the launcher then determines that at a particular instant there should be two vacuums running, then it simply starts two worker processes. How about calling it the autovacuum_master process? [snip autovacuum launcher process description] That all sounds reasonable to me. I think the harder part is what you are getting at below (how to get the launcher to figure out what to vacuum when). Scheduling == We introduce the following concepts: 1. table groups. 
We'll have a system catalog for storing OID and group name, and another catalog for membership, linking relid to group OID.

pg_av_tablegroup
  tgrname  name

pg_av_tgroupmembers
  groupid  oid
  relid    oid

2. interval groups. We'll have a catalog for storing igroup name and OID, and another catalog for membership. We identify an interval by: - month of year - day of month - day of week - start time of day - end time of day This is modelled after crontabs.

pg_av_intervalgroup
  igrname  name

pg_av_igroupmembers
  groupid    oid
  month      int
  dom        int
  dow        int
  starttime  timetz
  endtime    timetz

This seems to assume that the start and end time for an interval will be on the same day; you probably need to specify a start month, dom, dow, time and an end month, dom, dow and time. Since this is modeled after cron, do we allow wild-cards, or any of the other cron tricks like */20 or 1-3,5,7,9-11? Also, your notation above is ambiguous; it took me a while to realize that pg_av_igroupmembers.groupid wasn't referencing the id from pg_av_tablegroup. Additionally, we'll have another catalog on which we'll store table group to interval group relationships. On that catalog we'll also store those autovacuum settings that we want to be able to override: whether to disable it for this interval group, or the values for the vacuum/analyze equations.

pg_av_schedule
  tgroup            oid
  igroup            oid
  enabled           bool
  queue             int
  vac_base_thresh   int
  vac_scale_factor  float
  anl_base_thresh   int
  anl_scale_factor  float
  vac_cost_delay    int
  vac_cost_limit    int
  freeze_min_age    int
  freeze_max_age    int

What is queue for? So the scheduler, at startup, loads the whole schedule in memory, and then wakes up at reasonable intervals and checks whether these equations hold for some of the tables it's monitoring. If they do, then launch a new worker process to do the job. 
We need a mechanism for having the scheduler rescan the schedule when a user modifies the catalog -- maybe having a trigger that sends a signal to the process is good enough (implementation detail: the signal must be routed via the postmaster, since the backend cannot hope to know the scheduler's PID. This is easy enough to do.) This all looks reasonable if not a bit complex. Question: what happens to the current pg_autovacuum relation? Also, what about system defaults? Will we have a hard-coded default interval of always on, and one default table group that contains all the tables, with one default entry in pg_av_schedule? I think we need more discussion on scheduling; we need to make sure this solves the vacuum starvation problem. Does the launcher process consider each row in pg_av_schedule that applies at the current time separately?
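Putting the proposal's pieces together, one scheduler wakeup might look like this sketch (purely illustrative; the dict keys mirror the proposed pg_av_schedule columns, everything else, including interval matching, is invented or elided):

```python
# Sketch of one scheduler pass: walk the in-memory schedule, and for
# every enabled entry whose tables exceed the vacuum equation
#     dead_tuples > vac_base_thresh + vac_scale_factor * reltuples
# ask the launcher for a worker.  Interval matching is elided.

def scheduler_pass(schedule, stats, launch):
    """schedule: list of dicts mirroring pg_av_schedule rows, each with
    a 'tables' list; stats: per-table tuple counts; launch: callback."""
    for entry in schedule:
        if not entry["enabled"]:
            continue                    # disabled for this interval group
        for relid in entry["tables"]:
            s = stats[relid]
            threshold = (entry["vac_base_thresh"]
                         + entry["vac_scale_factor"] * s["reltuples"])
            if s["dead_tuples"] > threshold:
                launch(relid)
```

The per-group override columns fall out naturally here: the thresholds come from the schedule entry, not from a global GUC.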
Re: [HACKERS] Autovacuum improvements
Alvaro Herrera wrote: Matthew T. O'Connor wrote: Alvaro Herrera wrote: pg_av_igroupmembers groupid oid month int dom int dow int starttime timetz endtime timetz This seems to assume that the start and end time for an interval will be on the same day; you probably need to specify a start month, dom, dow, time and an end month, dom, dow and time. Actually, I was thinking that if you want intervals that cross day boundaries, you just add more tuples (one which finishes at 23:59:59 and another which starts at 00:00:00 the next day). This still seems ambiguous to me; how would I handle a maintenance window of weekends from Friday at 8PM through Monday morning at 6AM? My guess from what you said is:

mon   dom   dow   starttime   endtime
null  null  6     20:00       null
null  null  1     null        06:00

So how do we know to vacuum on Saturday or Sunday? I think clearly defined intervals with explicit start and stop times are cleaner. This all looks reasonable if not a bit complex. Question: what happens to the current pg_autovacuum relation? I had two ideas: one was to make pg_autovacuum hold the default config for all tables not mentioned in any group, so sites which are OK with 8.2's representation can still use it. The other idea was to remove it and replace it with this mechanism. Probably best to just get rid of it. GUC variables hold the defaults, and if we create a default interval / group, it will also have defaults. I think we need more discussion on scheduling; we need to make sure this solves the vacuum starvation problem. Does the launcher process consider each row in pg_av_schedule that applies at the current time separately? That is, say there are three entries in pg_av_schedule that apply right now; does that mean that the launcher can fire off three different vacuums? Perhaps we need to add a column to pg_av_tablegroup that specifies the max number of concurrent worker processes for this table group. 
My idea was to assign each table, or maybe each group, to a queue, and then have as many workers as there are queues. So you could put them all in a single queue and it would mean there can be at most one vacuum running at any time. Or you could put each group in a queue, and then there could be as many workers as there are groups. Or you could mix. And also there would be an autovac concurrency limit, which would be a GUC var saying how many vacuums to have at any time. Hmm... this seems like queue is nearly a synonym for group. Can't we just add a num_workers property to table groups? That seems to accomplish the same thing. And yes, a GUC variable to limit the total number of concurrent autovacuums is probably a good idea.
Re: [HACKERS] pdfs of the conference
Walter Cruz wrote: The larger version is only hidden from everyone :) http://people.planetpostgresql.org/mha/uploads/photo/conf/conference_group.jpg Very cool, I was hoping someone would post this. Any chance we can get a list of names to go with the faces?
Re: [HACKERS] autovacuum causing numerous regression-test failures
Peter Eisentraut wrote: Tom Lane wrote: Not a solution for make installcheck, Well, for make installcheck we don't have any control over whether autovacuum has been turned on or off manually anyway. If you are concerned about build farm reliability, the build farm scripts can surely be made to initialize or start the instance in a particular way. Another option might be to turn off stats_row_level on the fly. I'm sure I'm missing some of the subtleties of make installcheck issues, but autovacuum can be enabled / disabled on the fly just as easily as stats_row_level, so I don't see the difference? Or we put manual exceptions for the affected tables into pg_autovacuum. New feature? Or does that capability exist already? I haven't ever used the pg_autovacuum table but the documentation certainly makes one believe that this is possible. Right, if it doesn't work, that would certainly be a bug. This feature was included during the original integration into the backend during the 8.0 dev cycle. Let's just consider some of the options a bit more closely, and if they don't work, we'll revert it. Agreed. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [Pgsqlrpms-hackers] Safer auto-initdb for RPM init
Jim C. Nasby wrote: On Sat, Aug 26, 2006 at 01:32:17PM -0700, Joshua D. Drake wrote: I am not exactly sure why we initdb at all. IMHO it would be better if the start script just checked if there was a cluster. If not, it wouldn't start; it would error with: You do not have a cluster, please read the man page on initdb. As Tom mentioned, it's for newbie-friendliness. While I can understand that, I think it needs to be easy to shut that off. I understand that, but it seems the whole problem of people overwriting their data dir is because we initdb from the start script. If we installed the datadir during the RPM install, it would still be newbie friendly and would remove initdb from the start script, solving that problem. The only downside is a larger disk footprint for something that a user may never use. Given the size of today's hard drives, and the size of a standard RedHat install, I don't think that is much of an issue.
Re: [HACKERS] [Pgsqlrpms-hackers] Safer auto-initdb for RPM init
Joshua D. Drake wrote: Matthew T. O'Connor wrote: Jim C. Nasby wrote: As Tom mentioned, it's for newbie-friendliness. While I can understand that, I think it needs to be easy to shut that off. I understand that, but it seems the whole problem of people overwriting their data dir is because we initdb from the start script. If we installed the datadir during the RPM install, it would still be newbie friendly and would remove initdb from the start script, solving that problem. initdb will not overwrite an existing installation. Poorly chosen words. I meant the problem where the start script will create a new data dir when it doesn't see that one exists, even though one actually does exist and is just not available at the moment. Either way, if the start scripts never created a data dir, then there is no problem.
Re: [HACKERS] Autovacuum on by default?
Peter Eisentraut wrote: Summarizing this thread, I see support for the following: - autovacuum set to on by default in 8.2. Yes. - stats_row_level also defaults to on. Yes. (Perhaps stats_block_level should also default to on so it's not inconsistent, seeing that everything else is on, too.) I haven't followed this closely, but are all the other stats settings other than block_level now on by default? In general I would vote to keep it off if not needed, just for performance reasons, though I haven't measured the effect of block_level turned on. Anyone measured this? - Delayed vacuum and delayed autovacuum will stay disabled. Unfortunately. - Scale factor set to 0.08 (vacuum) and 0.04 (analyze) (?) (formerly 0.4 and 0.2) That seems a big jump. BTW, I know .08 and .04 were suggested, but I didn't see confirmation that it was a good idea. I know my initial values were grossly over-conservative, but I am concerned about bogging down the server with lots of vacuums, especially since we don't have the delay settings on by default, nor do we have a maintenance window yet. - Leave base thresholds alone (pending further analysis that might remove them altogether?) While there is talk of removing them altogether, I think it was also agreed that as long as these values are there, they should be reduced. I think the defaults in 8.1 are 1000/500; I think 200/100 was suggested.
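For reference, the base threshold and scale factor combine in autovacuum's vacuum equation (dead tuples must exceed base + factor × table size); the sketch below just plugs in the 8.1 vacuum defaults against the proposed ones to show the size of the jump:

```python
# The vacuum side of the autovacuum equation:
#   vacuum when dead_tuples > base_threshold + scale_factor * reltuples
# Comparing 8.1 vacuum defaults (1000, 0.4) with the proposed values
# (200, 0.08) on a hypothetical 100,000-row table.

def vacuum_trigger_point(reltuples, base, scale):
    """Number of dead tuples beyond which a vacuum is triggered."""
    return base + scale * reltuples

old = vacuum_trigger_point(100_000, 1000, 0.4)   # 8.1: 41,000 dead rows
new = vacuum_trigger_point(100_000, 200, 0.08)   # proposed: 8,200
```

So on this example table, the proposed defaults would vacuum roughly five times as often, which is the "big jump" being questioned.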
Re: [HACKERS] Autovacuum on by default?
Tom Lane wrote: Matthew T. O'Connor matthew@zeut.net writes: While there is talk of removing them altogether, I think it was also agreed that as long as these values are there, they should be reduced. I think the defaults in 8.1 are 1000/500; I think 200/100 was suggested. ISTM that if we don't want to remove the thresholds immediately, we should make them default to zero for a release or two and see how well it works. At the moment I can't find the thread that discussed removing them, but IIRC there were some good arguments why the thresholds should always be zero. I don't have any significant objection to removing them; it just seemed to me that we are late in the release cycle and that might be more than we want to do at the moment. If others think it's OK, then it's OK with me.
Re: [HACKERS] Autovacuum on by default?
Peter Eisentraut wrote: On Thursday, 17 August 2006 18:40, Josh Berkus wrote: I'm in favor of this, but do we want to turn on vacuum_delay by default as well? People might complain that suddenly their vacuum runs take four times as long (or whatever). Of course, if we turn on autovacuum and advocate a more or less hands-off vacuum policy, they won't have to care either way. All of this would certainly be release-notes material. Does anyone, for that matter, want to propose possible default parameters for vacuum_delay? I said vacuum_delay, but I should have been more specific: there are autovacuum-specific GUC variables, which are what we should be talking about. This way manually run, or nightly cron-run, vacuums are still as fast as they ever were.
Re: [HACKERS] Adjust autovacuum naptime automatically
ITAGAKI Takahiro wrote: Matthew T. O'Connor matthew@zeut.net wrote: What is this based on? That is, based on what information is it deciding to reduce the naptime? If there are some vacuum or analyze jobs, the naptime is shortened (i.e, autovacuum is accelerated). And if there are no jobs, the naptime is lengthened (autovacuum is decelerated). Yeah, I looked through the patch after I sent this email. It's an interesting perspective, but I want to see some performance numbers or significant bloat reduction before I agree this is a good idea. Again, when a table is busy, constant vacuuming will help keep down bloat, but at the expense of throughput. I noticed my method is based on different views from contrib/pg_autovacuum. I'm afraid of the lack of vacuum by autovacuum. So if the database seems to require frequent vacuums, I'll accelerate autovacuum, and vice versa. If we have a small heavily-updated table and a large rarely-updated table, we should vacuum the small one soon after vacuum on the large one is done, even if the large vacuum takes long time. -- but hmm, it may be better to have multiple autovacuums in such a case primarily. Yes, I think we are heading in this direction. As of 8.2 PostgreSQL will allow multiple vacuums at the same time (just not on the same table), autovacuum hasn't been trained on this yet, but I think it will eventually. I agree. We can use autovacuum thresholds and cost-delay parameters to control the frequency and priority of vacuum. I don't think it is good to control vacuums by changing naptime. Now I'm confused, are you now saying that you don't like the concept behind your patch? Or am I misunderstanding. I think your idea might be a good one, I'm just not sure yet. Matt ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Autovacuum on by default?
Peter Eisentraut wrote: Is it time to turn on autovacuum by default in 8.2? I know we wanted to be on the side of caution with 8.1, but perhaps we should evaluate the experiences now. Comments? Would be fine by me, but I'm curious to see what the community has to say. A few comments: Autovacuum can cause unpredictable performance issues, that is, if it vacuums in the middle of a busy day and people don't want that; of course they can turn it off easily enough, but they might be surprised. I haven't played with CVS HEAD much, but I think the logging issue has been addressed, no? That is my single biggest gripe with the 8.1 autovacuum: it's very hard to see if it's actually done anything without having to turn up the logging significantly. The remaining big-ticket items for autovacuum are the maintenance window that Alvaro and I have just been discussing, and multiple concurrent vacuums (possibly you could add the continued reduction of vacuum impact, but that's just a constant thing). Do we think it's worth waiting for either of these two features prior to turning on autovacuum by default? Matt
Re: [HACKERS] Autovacuum on by default?
Bruce Momjian wrote: Matthew T. O'Connor wrote: Would be fine by me, but I'm curious to see what the community has to say. A few comments: Autovacuum can cause unpredictable performance issues, that is, if it vacuums in the middle of a busy day and people don't want that; of course they can turn it off easily enough, but they might be surprised. I haven't played with CVS HEAD much, but I think the logging issue has been addressed, no? That is my single biggest gripe with the 8.1 autovacuum: it's very hard to see if it's actually done anything without having to turn up the logging significantly. This has not been addressed, except that pg_stat_activity now shows autovacuum. Someone was going to work on per-module log output, but it wasn't completed for 8.2. Does pg_stat_activity now show the table being vacuumed? Hmm... I thought it had been addressed, not the full-blown per-module log output stuff, but just a simple reorganizing of the log levels for autovacuum messages. That is, lowering the level for: LOG: autovacuum: processing database foo and increasing the log level when autovacuum actually fires off a VACUUM or ANALYZE command.
Re: [HACKERS] Autovacuum on by default?
Rod Taylor wrote: The defaults could be a little more aggressive for both vacuum and analyze scale_factor settings; 10% and 5% respectively. I would agree with this; I'm not sure if 10%/5% are right, but the general feedback I have heard is that while the defaults in 8.1 are much better than the ones in the contrib module, they are still not aggressive enough.
Re: [HACKERS] Autovacuum on by default?
Bruce Momjian wrote: Matthew T. O'Connor wrote: and increasing the log level when autovacuum actually fires off a VACUUM or ANALYZE command. This was not done because logging control only for autovacuum was going to be added. Right now, if you want to see the vacuum activity, you end up seeing debug stuff too (very ugly). Any chance we can make this change before release? I think it's very important to be able to look through the logs and *know* whether your tables are getting vacuumed or not.
Re: [HACKERS] Autovacuum maintenance window (was Re: Adjust autovacuum
Alvaro Herrera wrote: My vision is a little more complex than that. You define group of tables, and separately you define time intervals. For each combination of group and interval you can configure certain parameters, like a multiplier for the autovacuum thresholds and factors; and also the enable bit. So you can disable vacuum for some intervals, and refine the equation factors for some others. This is all configured in tables, not in GUC, so you have more flexibility in choosing stuff for different groups of tables (say, you really want the small-but-high-update tables to be still vacuumed even during peak periods, but you don't want that big fat table to be vacuumed at all during the same period). That sounds good. I worry a bit that it's going to get overly complex. I suppose if we create the concept of a default window that all new tables will be automatically be added to when created, then out of the box we can create 1 default 24 hour maintenance window that would effectively give us the same functionality we have now. Could we also use these groups to be used for concurrent vacuums? That is autovacuum will loop through each group of tables independently thus allowing multiple simultaneous vacuums on different tables and giving us a solution to the constantly updated table problem. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Autovacuum on by default?
Bruce Momjian wrote: Matthew T. O'Connor wrote: Any chance we can make this change before release? I think it's very important to be able to look through the logs and *know* that you tables are getting vacuumed or not. Agreed. I just IM'ed Alvaro and he says pg_stat_activity should now show exactly what autovacuum is doing (and if it doesn't, let's fix it). I think that is the best solution to the monitoring problem, rather than throwing lines in the server logs. I'm not sure I agree with this. I can use the pg_stat_activity table to see if autovacuum is doing something right now, but what I want to be able to do is look through my logs and see that table_foo hasn't been vacuumed since last week, or that table_bar has been vacuumed 7 times today. Can I do that just with the stat system alone? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Autovacuum on by default?
Josh Berkus wrote: Is it time to turn on autovacuum by default in 8.2? I know we wanted to be on the side of caution with 8.1, but perhaps we should evaluate the experiences now. Comments? I'm in favor of this, but do we want to turn on vacuum_delay by default as well? I thought about this; it might be a good idea as it will mitigate the impact of vacuums, however it will also slow down vacuums. I'm a bit concerned that it won't be able to keep up on a really large database, or that it'll fall really far behind after vacuuming a big table. Also, if we do enable it, what is a good default?
Re: [HACKERS] Adjust autovacuum naptime automatically
Jim C. Nasby wrote: On Thu, Aug 17, 2006 at 03:00:00PM +0900, ITAGAKI Takahiro wrote: IMO, the only reason at all for naptime is because there is a non-trivial cost associated with checking a database to see if any vacuuming is needed. This cost is reduced significantly in the integrated version as compared to the contrib version, but yes, still not zero. One problem that I've run across is that in a cluster with a lot of databases it can take a very long time to cycle through all of them. Perhaps a better idea would be to check a number of databases on each pass. That way you won't bog the server down while checking, but it won't take as long to get to all the databases. Also, autovac should immediately continue checking databases after it finishes vacuuming one. The reason for this is that while vacuuming, the vacuum_cost_delay settings will almost certainly be in effect, which will prevent autovac from hammering the system. Since the system won't be hammered during the vacuum, it's OK to check more databases immediately after finishing vacuuming on one. This is basically what Itagaki's patch does. Does anyone have any info on how much load there actually is when checking databases to see if they need vacuuming? I haven't.
Re: [HACKERS] Autovacuum on by default?
Jim C. Nasby wrote: On Thu, Aug 17, 2006 at 12:41:57PM -0400, Matthew T. O'Connor wrote: Would be fine by me, but I'm curious to see what the community has to say. A few comments: Autovacuum can cause unpredictable performance issues, that is, if it vacuums in the middle of a busy day and people don't want that; of course they can turn it off easily enough, but they might be surprised. Anyone in that situation needs to be more hands-on with the database anyway. I think the big target for autovac is beginners who otherwise would bloat themselves silly (of course, it's also very useful far beyond beginners, but by that time an admin should have some clue about what they're doing). Fair enough; also, if we turn on the delay setting by default, it will help address this. And +1 on Rod's suggestion to make it more aggressive. I always drop the scale factor to at least 0.2 and 0.1 (though 0.1 and 0.05 don't seem unreasonable), and typically drop the thresholds to 200 and 100 (though again, lower is probably warranted). OK. Actually, on a table small enough for the thresholds to kick in it's going to be extremely fast to vacuum anyway, and the table is probably either static or changing very rapidly. I'm wondering if maybe they should just default to 0? I assume you are suggesting that the base value be 0? Well, for one thing, if the table doesn't have any rows, that will result in constant vacuuming of that table, so it needs to be greater than 0. For a small table, say 100 rows, there usually isn't much performance impact if the table is 50% dead space, so I think the base values you suggest are OK, but they shouldn't be 0.
Re: [HACKERS] Autovacuum on by default?
Larry Rosenman wrote: Alvaro Herrera wrote: Bruce Momjian wrote: Well, the problem is that it shows what it's *currently* doing, but it doesn't let you know what has happened in the last day or whatever. It can't answer "has table foo been vacuumed recently?" or "what tables haven't been vacuumed at all during this week?" I added last vacuum and last analyze (both auto and manual) dates in the 8.2 tree. Hmm... well that does address most of my issue. Doesn't tell me that table_foo is getting vacuumed constantly, but it does tell me that it was vacuumed recently. Might be good enough.
Re: [HACKERS] Autovacuum on by default?
Alvaro Herrera wrote: Matthew T. O'Connor wrote: I assume you are suggesting that the base value be 0? Well, for one thing, if the table doesn't have any rows, that will result in constant vacuuming of that table, so it needs to be greater than 0. For a small table, say 100 rows, there usually isn't much performance impact if the table is 50% dead space, so I think the base values you suggest are OK, but they shouldn't be 0. Actually Tom suggested some time ago that we should get rid of the base value completely, i.e. make it 0 forever. A table with 0 rows would not show any activity in pgstats, so it would not be vacuumed constantly. Only once after it's truncated. OK, forgot that. Well, I put it in originally as a way to give more flexibility to the calculation: if I want a table vacuumed every 100 updates, then I can set the scaling factor to 0 and the base value to 100, but maybe that's not really needed. It would simplify things if we got rid of it.
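The threshold arithmetic being debated here is simple enough to sketch. This is an illustration of the formula under discussion (base value plus scale factor times table size), not the actual server source; the function names are mine:

```python
def vacuum_threshold(base_value, scale_factor, reltuples):
    """Autovacuum's trigger point: base value + scale_factor * table size."""
    return base_value + scale_factor * reltuples

def needs_vacuum(dead_tuples, base_value, scale_factor, reltuples):
    """A table qualifies for vacuum once its dead tuples exceed the threshold."""
    return dead_tuples > vacuum_threshold(base_value, scale_factor, reltuples)

# With scale_factor = 0 and base_value = 100, the table is vacuumed
# every 100 updates regardless of its size:
print(needs_vacuum(dead_tuples=101, base_value=100, scale_factor=0.0,
                   reltuples=1_000_000))  # True

# With base_value = 0 on an empty table the threshold degenerates to 0,
# which is why a nonzero base was originally thought necessary:
print(vacuum_threshold(base_value=0, scale_factor=0.4, reltuples=0))  # 0.0
```

Tom's point is that an empty table shows no activity in pgstats at all, so the degenerate threshold never actually fires in practice.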
Re: [HACKERS] [PATCHES] Adjust autovacuum naptime automatically
Alvaro Herrera wrote: ITAGAKI Takahiro wrote: In the case of a heavy update workload, the default naptime (60 seconds) is too long to keep the number of dead tuples low. With my patch, the naptime will be adjusted to around 3 seconds in the case of pgbench (scale=10, 80 tps) with the other autovacuum parameters at default. What is this based on? That is, based on what information is it deciding to reduce the naptime? Interesting. To be frank I don't know what the sleep scale factor was supposed to do. I'm not sure whether the sleep scale factor is a good idea at this point, but what I was thinking back in the day when I originally wrote the contrib autovacuum is that I didn't want the system to get bogged down constantly vacuuming. So, if it just spent a long time working on one database, it would sleep for a long time. Given that we can now specify the vacuum cost delay settings for autovacuum and disable tables and everything else, I'm not sure we need this anymore, at least not as it was originally designed. It sounds like Itagaki is doing things a little differently with his patch, but I'm not sure I understand it. - I removed autovacuum_naptime guc variable, because it is adjusted automatically now. Is it appropriate? I think we should provide the user with a way to stop the naptime from changing at all. Eventually we will have the promised maintenance windows feature which will mean the user will not have to worry at all about the naptime, but in the meantime I think we should keep it. I'm not sure that's true. I believe we will want the naptime GUC option even after we have the maintenance window. I think we might ignore the naptime during the maintenance window, but even after we have the maintenance window, we will still vacuum during the day as required. 
My vision of the maintenance window has always been very simple, that is, during the maintenance window the thresholds get reduced by some factor (probably a GUC variable), so during the day it might take 10,000 updates on a table to cause a vacuum, but during the window it might be 10% of that, 1,000. Is this in line with what others were thinking?
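The day-vs-window behavior described above can be sketched in a few lines. The factor name is hypothetical, standing in for the proposed GUC variable; the numbers are just the example ratio from the discussion:

```python
def effective_threshold(base_threshold, in_window, window_factor=0.1):
    """Reduce the vacuum threshold during the maintenance window.
    window_factor is a stand-in for the proposed GUC (hypothetical name)."""
    return base_threshold * window_factor if in_window else base_threshold

# Outside the window the full threshold applies; inside it, 10% of it:
print(effective_threshold(10_000, in_window=False))  # 10000
print(effective_threshold(10_000, in_window=True))   # 1000.0
```

The appeal of this scheme is that nothing else about autovacuum's decision logic changes; only the comparison point moves.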
Re: [HACKERS] New beginings
Darcy Buskermolen wrote: Dear Community members, It is with great enthusiasm I announce that I have accepted an offer from Joshua D. Drake of Command Prompt Inc, to join his team. As former Vice President of Software Development with Wavefire Technologies Corp, I endeavor to leverage over 10 years of technical expertise and 8 years of PostgreSQL background within my new team. Rest assured that I will continue my role within the Slony development group to bring additional features, robustness and usability to the project. I wish all the best to my former co-workers, and plan for a bright future within the ranks of the Command Prompt team. Congrats!
Re: [HACKERS] 8.2 features status
Robert Treat wrote: So, the things I hear most non-postgresql people complain about wrt postgresql are:
- no full text indexing built in
- no replication built in
- no stored procedures (with a mix of wanting in db cron facility)
- the planner is not smart enough (with a mix of wanting hints)
- vacuum leads to unpredictable performance
FTI is a biggie in my mind. I know it ain't happening for 8.2, but is the general plan to integrate TSearch2 directly into the backend?
Re: [HACKERS] CSV mode option for pg_dump
Bill Bartlett wrote: Can't -- the main production database is over at a CoLo site with access only available via SSH, and tightly-restricted SSH at that. Generally one of the developers will SSH over to the server, pull out whatever data is needed into a text file via psql or pg_dump, scp the file(s) back here and send them to the user. ODBC over an SSH tunnel?
Re: [HACKERS] [pgsql-advocacy] Toward A Positive Marketing Approach.
Tom Lane wrote: Josh Berkus josh@agliodbs.com writes: Other projects need even more intensive coding help. OpenOffice, for example, doesn't offer the Postgres driver by default because it's still too buggy. That seems like something that it'd be worth our while to help fix. +1 (or +10 if that's not too piggy ;-)
Re: [HACKERS] Logging pg_autovacuum
I think there are two things people typically want to know from the logs: 1) Is autovacuum running 2) Did autovacuum take action (issue a VACUUM or ANALYZE) I don't think we need to mention the name of each and every database we touch; we can, but it should be at a lower level like DEBUG1 or something. I don't know what logging level these things should go at, but I for one would like them to be fairly high and easy to get to, perhaps NOTICE? Matt Bruce Momjian wrote: Tom Lane wrote: [EMAIL PROTECTED] (Bruce Momjian) writes: Change log message about vacuuming database name from LOG to DEBUG1. Prevents duplicate meaningless log messages. Could we have some discussion about this sort of thing, rather than unilateral actions? Those messages were at LOG level because otherwise it's difficult to be sure from the log that autovac is running at all. OK, so what do we want to do? Clearly outputting something every time pg_autovacuum touches a database isn't ideal. By default, the server logs should show significant events, which this is not. Do we want something output only the first time autovacuum runs?
Re: [HACKERS] problem with large maintenance_work_mem settings and
Stefan Kaltenbrunner wrote: foo=# set maintenance_work_mem to 200; SET foo=# VACUUM ANALYZE verbose; INFO: vacuuming information_schema.sql_features ERROR: invalid memory alloc request size 204798 Just an FYI, I reported a similar problem on my 8.0.0 database a few weeks ago. I upgraded to 8.1.3 and just recently got that message again in a nightly maintenance email. vacuumdb: vacuuming database postgres vacuumdb: vacuuming of database postgres failed: ERROR: out of memory DETAIL: Failed on request of size 167772156. The interesting thing is that there aren't any tables in the postgres database. This happened Thursday night, but vacuum ran fine on Friday night. I'm on vacation right now, so I can't really look into it much deeper at the moment, but I thought I would mention it. Matt
Re: [HACKERS] Automatic free space map filling
Alvaro Herrera wrote: Csaba Nagy wrote: Now when the queue tables get 1000 times dead space compared to their normal size, I get performance problems. So tweaking vacuum cost delay doesn't buy me anything, as it's not vacuum per se that is the performance problem, it's the long run time for big tables. So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. I got the impression that Csaba is looking more for multiple simultaneous vacuums than for partial vacuum. Not sure the best way to set this up, but perhaps a flag in the pg_autovacuum table that says vacuum this table even if there is another vacuum running. That way you can control things and not have autovacuum firing off lots of vacuums at the same time. Sounds to me like these frequently updated queue tables need to be monitored closely and not ignored for a long period of time because we are vacuuming another table. Has anyone looked more closely at the multiple vacuum patch that was submitted to the patches list a while ago? Matt
Re: [HACKERS] Automatic free space map filling
Csaba Nagy wrote: So he rather needs Hannu Krosing's patch for simultaneous vacuum ... Well, I guess that would be a good solution to the queue table problem. The problem is that I can't deploy that patch on our production systems without being fairly sure it won't corrupt any data... and I can't rely on non-production testing either. Basically I'm waiting to see Tom saying it will fly :-) That patch is a step forward if it's deemed OK by the powers that be. However, autovacuum would still need to be taught to handle simultaneous vacuums. I suppose that in the interim, you could disable autovacuum for the problematic queue table and have cron issue a manual vacuum command for that table at the required frequency. Anyone up for working on / testing / improving Hannu's patch? I think it's beyond my skill set. Matt
Re: [HACKERS] Automatic free space map filling
Tom Lane wrote: Matthew T. O'Connor matthew@zeut.net writes: That patch is a step forward if it's deemed OK by the powers that be. However, autovacuum would still need to be taught to handle simultaneous vacuums. I suppose that in the interim, you could disable autovacuum for the problematic queue table and have cron issue a manual vacuum command for that table at the required frequency. I'm not sure you should think of that as an interim solution. I don't really like the idea of multiple autovacuums running concurrently. ISTM autovac is intended to be something that lurks in the background and doesn't take up an unreasonable percentage of your system bandwidth ... but if there's more than one of them, it's going to be mighty hard to control the overall load penalty. Plus you have to worry about keeping them off each others' backs, ie, not all trying to vac the same table at once. And in a scenario like Csaba's, I think the hotspot tables are just exactly what they'd all try to vacuum. For small hotspot tables I think a scheduled vacuum process is just the thing, whereas autovac is more of a free-lance thing to keep the rest of your DB in line. While I agree that, given the current state of affairs, the cron solution works, I personally want autovac to solve all of our vacuuming needs; I really dislike the idea of requiring a cron-based solution to solve a fairly typical problem. Besides, the cron solution is sloppy: it blindly vacuums whether it's needed or not, resulting in a net increase of cycles spent vacuuming. Anyway, I don't know the best way to implement it, but I wasn't thinking of just firing off multiple autovac processes. I was envisioning something like an autovacuum master process that launches (forks?) VACUUM commands and has some smarts about how many processes to fire off, or that it would only fire off simultaneous VACUUMs for tables that have been flagged as hot spot tables. 
I recognize that teaching autovac to handle simultaneous VACUUMs in a sane way will require a quantum leap of complexity, but it still seems a better long-term solution. I would agree that using cron makes sense if we were seeing lots of different scenarios that we couldn't possibly anticipate, but I don't think that is where we are. BTW, this discussion is only relevant if we allow simultaneous vacuum. Is this something you see as inevitable, whether or not you think Hannu's implementation is acceptable? Matt
Re: [HACKERS] Automatic free space map filling
Csaba Nagy wrote: From my POV, there must be a way to speed up vacuums on huge tables and small percentage of to-be-vacuumed tuples... a 200-million-row table with frequent updates of the _same_ record is causing me some pain right now. I would like to have that table vacuumed as often as possible, but right now it only works to do it once per week due to load problems on long-running transactions preventing vacuuming other tables. Are you running 8.1? If so, you can use autovacuum and set per-table thresholds (read: vacuum aggressively) and per-table cost delay settings so that the performance impact is minimal. If you have tried 8.1 autovacuum and found it unhelpful, I would be curious to find out why. Matt
Re: [HACKERS] autovacuum
Alvaro Herrera wrote: Chris Browne wrote: It strikes me as a slick idea for autovacuum to take on that behaviour. If the daily backup runs for 2h, then it is quite futile to bother vacuuming a table multiple times during that 2h period when none of the tuples obsoleted during the 2h period will be able to be cleaned out until the end. Hmm, yeah, sounds useful. There's one implementation issue to notice however, and it's that the autovacuum process dies and restarts for each iteration, so there's no way for it to remember previous state unless it's saved somewhere permanent, as the stats info is. However this seems at least slightly redundant with the maintenance window feature -- you could set a high barrier to vacuum during the daily backup period instead. (Anybody up for doing this job?) I can't promise anything, but it's on my list of things to hopefully find time for in the coming months. No way I can start it in Feb, but maybe sometime in March. Anyone else? Matt
Re: [HACKERS] stats for failed transactions (was Re: [GENERAL] VACUUM
Tom Lane wrote: I'd argue it's fine: there are tons of people using row-level stats via autovacuum, and (AFAICT) just about nobody using 'em for any other purpose. Certainly you never see anyone suggesting them as a tool for investigating problems on pgsql-performance. Sure, it's a repurposing of the stats subsystem, but we should be willing to do that when field experience teaches us what's really needed. OK. But all that we have is *global* counts. Inferring ratios applicable to particular tables seems a big stretch to me. Any given application is likely to have some types of transactions that roll back much more often than others. Certainly a stretch, but it's far from fatal: the worst-case scenario is we occasionally vacuum a table that might not need it, whereas the way things stand right now, the worst case is that we never vacuum a table that might have a lot of slack space. BTW, I'm only arguing this based on what is in the stats system now and pondering how we might improve things if the stats system isn't changed to directly address this problem. One thing we could do is tie the stats message sending more tightly to top-level transaction commit/abort. (It's already effectively true that we send stats only after commit/abort, but we send 'em from the wrong place, ie PostgresMain.) Then the semantics of the message could be read as "here's what I did before committing" or "here's what I did before aborting" and the collector could interpret the counts accordingly. However, this still fails in the case where a committed top-level transaction includes some failed subtransactions. I think the only full solution will involve backends doing some extra work at subtransaction commit/abort so that they can report properly classified update counts. Any guess as to the performance implications?
Re: [HACKERS] stats for failed transactions (was Re: [GENERAL] VACUUM
Tom Lane wrote: hmm... That's true. I don't think autovacuum does anything to account for rolled-back inserts. I think this is the fault of the stats system design. AFAICT from a quick look at the code, inserted/updated/deleted tuples are reported to the collector in the same way regardless of whether the sending transaction committed or rolled back. I think this is unquestionably a bug, at least for autovacuum's purposes --- though it might be OK for the original intent of the stats system, which was simply to track activity levels. Any thoughts about how it ought to work? The fact that autovacuum bases its decisions on info from the stats system is arguably an abuse of the original design. However, I don't know of a better source of information at the moment. It has always been my vision that autovacuum will eventually incorporate additional information sources to make better informed decisions. There has always been discussion of using the FSM to help clue us in as to when we need another vacuum. Perhaps the addition of the vacuum space map that people are talking about will also help. None of this directly addresses the question of what the stats system *should* track, but perhaps it is wrongheaded to totally redesign the stats system for the purposes of autovacuum. As a quick semi-fix, perhaps autovacuum should look at the number of rollbacks vs. commits in an attempt to determine the accuracy of the stats. For example, if 50% of the transactions are getting rolled back, then autovacuum might include 50% of the inserts in the count towards the vacuum threshold. Obviously this isn't perfect, but it probably gets us closer to reality with the information already available. Thoughts? Matt
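The quick semi-fix suggested above is easy to sketch: assume inserts roll back at the same rate as transactions overall, and count that fraction as dead tuples. The function name is mine, and this is only an illustration of the proposal, not anything in the server:

```python
def dead_tuples_from_inserts(reported_inserts, commits, rollbacks):
    """Quick semi-fix sketch: treat the rolled-back fraction of reported
    inserts as dead tuples counting toward the vacuum threshold."""
    total = commits + rollbacks
    if total == 0:
        return 0.0
    return reported_inserts * (rollbacks / total)

# 50% of transactions rolled back -> count half the inserts as dead:
print(dead_tuples_from_inserts(1000, commits=50, rollbacks=50))  # 500.0
```

As the message notes, this is inexact because the global rollback ratio need not match any particular table's, but it beats treating all reported inserts as live.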
Re: [HACKERS] Segv in pg_autovacuum in 8.0.x
daveg wrote: Apologies if this is old news, but pg_autovacuum in 8.0.x has the bad habit of SEGVing and exiting when a table gets dropped out from under it. This creates problems if you rely on pg_autovacuum for the bulk of your vacuuming, as it forgets its statistics when it is restarted and so will skip some desirable vacuums. I have a patch for this that should apply against any 8.0.x if there is any interest. Please post the patch, it might get committed into 8.0.x. I personally haven't heard any other reports of it dying when tables are dropped, but I don't doubt there are some lingering issues like this. I looked at the new autovacuum in 8.1 and it appears from casual inspection not to have the same problem. The autovacuum in 8.1 is a very different critter. While its design is based on the contrib autovacuum, it is mostly a total rewrite, and probably a vast improvement :-) Matt
Re: [HACKERS] Stuff running slooow
Marc G. Fournier wrote: As a couple of ppl have found out by becoming 'moderators' for the mailing lists, there are *a lot* of messages through the server that aren't from list subscribers, but are legit emails ... Perhaps that shouldn't be allowed? Would it help things if all non-subscriber emails were just bounced/dropped immediately, before anti-virus scanning, etc.? Seems this would save a lot of CPU time and, more importantly, people's time reviewing potentially legit emails.
Re: [HACKERS] Autovacuum to-do list
All the items you mentioned look like 8.2 issues to me. But here are some thoughts. Alvaro Herrera wrote: * Enable autovacuum by default. Get some field experience with it first, so the worst bugs are covered. (Has anybody tested it?) I have done some testing and it seems to be working OK. I am planning on doing some more in-depth tests this week. * Stop a running VACUUM if the system load is too high. What if vacuum used a delay equal to the vacuum delay GUC setting * the system load? Or something more sophisticated, but this would have the effect of having vacuum automatically throttle down when the system gets busy and throttle back up when the system gets quiet. We would probably need to set some min/max values, but it sounds interesting. Thoughts?
Re: [HACKERS] Autovacuum to-do list
Andrew - Supernews wrote: On 2005-08-01, Matthew T. O'Connor matthew@zeut.net wrote: * Stop a running VACUUM if the system load is too high. What if vacuum used a delay equal to the vacuum delay GUC setting * the system load? Or something more sophisticated, but this would have the effect of having vacuum automatically throttle down when the system gets busy and throttle back up when the system gets quiet. We would probably need to set some min/max values, but it sounds interesting. Thoughts? I'd be very concerned about feedback loop stability; insufficient vacuuming can increase the system load, causing vacuum to get further behind... Right, which is why we would need to enforce some max value so that vacuuming will never be totally squeezed out.
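The load-scaled delay with a clamp, as proposed above, can be sketched in a couple of lines. The clamp bounds and function name are hypothetical; the max cap is what keeps vacuum from being squeezed out entirely when load spirals:

```python
def adaptive_cost_delay(base_delay_ms, system_load, min_ms=0, max_ms=100):
    """Scale the configured vacuum cost delay by current system load,
    clamped to [min_ms, max_ms] so vacuuming always makes some progress.
    The bounds are illustrative, not proposed defaults."""
    return max(min_ms, min(base_delay_ms * system_load, max_ms))

print(adaptive_cost_delay(10, system_load=0.5))   # 5.0 (quiet system: short naps)
print(adaptive_cost_delay(10, system_load=50.0))  # 100 (busy system: capped)
```

The feedback-loop concern is visible here: without `max_ms`, a load spike caused by bloat would lengthen the delay, which worsens the bloat, which raises the load further.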
Re: [HACKERS] Autovacuum loose ends
This is great news! I will do what I can to continue improving the code and address these concerns as best I can. Many of the items below will need to be addressed by Alvaro, but I will comment where I think I have something useful to say :-) Tom Lane wrote: I've applied Alvaro's latest integrated-autovacuum patch. There are still a number of loose ends to be dealt with before beta, though: * Not all the functionality of the current contrib code seems to have made it in. In particular I noted the sleep scaling factor is missing, as well as the options to use nondefault vacuum_cost_delay settings. (I'm not sure how important the sleep scale factor is, but the vacuum cost options seem pretty critical for practical use.) There may be other stuff to move over; Matthew or someone more familiar than I with the contrib version needs to take a look. (I have refrained from removing the contrib module until we're sure we have extracted everything from it.) I will take a look for missing features, thanks for not removing it from contrib yet. As for the sleep factor I'm not sure it makes sense. It was initially put in as a way to prevent autovacuum from running more than X% of the time. However, I think the better answer these days is to use the vacuum delay settings. Speaking of which, I think I mentioned this to Alvaro, but I guess it just didn't make it in. The pg_autovacuum table should have a few additional columns that allow setting vacuum delay settings on a per table basis. I also think that there should be GUC settings for the default autovacuum delay settings which an admin might want to be separate from the system wide default vacuum delay settings. * The code does not make a provision to ignore temporary tables. Although vacuum.c and analyze.c will disregard the request to touch such tables, it'd probably be better to recognize the situation further upstream. 
In particular it seems that autovacuum will continually throw ANALYZE requests for a temp table due to lack of stats. Does the stats system track data about temp tables? If it doesn't, then autovacuum won't try to vacuum them. Will take a look. * ANALYZE also refuses to do anything with pg_statistic itself, which is another case that may need special treatment to avoid useless cycles. Should be easy enough to tell autovacuum to ignore this table specifically. * For that matter I'm unconvinced that it's a good idea to try to force the pgstat DB to pick up every table in every database. If there's no entry it's because the table is not getting modified, and therefore it seems to me that we can just leave well enough alone. The code really is not very good about doing nothing where nothing is called for ;-) I think in a production environment, this won't be an issue, but in a development situation where the postmaster is getting stopped and started fairly often, it could be an issue. Actually, if the stats system doesn't reset its data on postmaster restart, this shouldn't be a problem. Any thoughts on changing this default? * The code ignores datallowconn and therefore will periodically vacuum template0. I've got mixed emotions about this --- it could save someone's bacon if they failed to properly VACUUM FREEZE a template database, but in 99.99% of installations it's just wasted cycles. Maybe it'd make sense to perform XID-wraparound-prevention vacuuming, but not anything more, in a template DB. Thoughts? Sounds like a good idea. Bacon conservation is clearly one of the goals of autovacuum. * Or actually, it would vacuum template0, except that since no regular backend ever connects to template0, there will be no stats DB entry for it and so the loop in AutoVacMain will ignore it. This is definitely BAD as it means that a database that's not been touched since postmaster start will never be vacuumed, not even for XID wraparound prevention. 
That test needs to be weakened. * I'm still pretty concerned about the handling of shared catalogs. AFAICS the current pgstats infrastructure simply gets this wrong, meaning that shared catalogs will not get adequate vacuuming. We need to fix that. This was handled in the contrib version by only vacuuming shared catalogs inside template1; however, it would then analyze those tables in each and every database. Is there a reason this solution is not adequate? Or perhaps this concept doesn't translate to the integrated version? * As Alvaro noted, the default parameter settings need a lookover. What is in the patch is not what was the default in the contrib module, but the contrib defaults seem awfully passive. Alvaro and I talked about this. I suggested these as the new defaults as there seemed to be a consensus that the defaults in the contrib version were not very useful for most people. Hopefully these defaults are still a bit conservative, but useful. * The documentation badly needs work. I committed some
Re: [HACKERS] Autovacuum loose ends
Tom Lane wrote: Matthew T. O'Connor matthew@zeut.net writes: Speaking of which, I think I mentioned this to Alvaro, but I guess it just didn't make it in. The pg_autovacuum table should have a few additional columns that allow setting vacuum delay settings on a per table basis. I also think that there should be GUC settings for the default autovacuum delay settings which an admin might want to be separate from the system wide default vacuum delay settings. I was thinking GUC settings only; is there a real use-case for table-specific delay parameters? ISTM the point of the delay parameters for autovac is to put a lid on its impact on interactive response. Seen in that light, you do not care exactly which table it's hitting at the moment. I was thinking of users that might not want the vacuum delay settings on small tables that will normally be vacuumed very quickly. This isn't a very strong argument, but I thought I should mention it. Also, given the project's tendency toward not giving knobs to users unless we are sure they need them, I think GUC only would be OK. This was handled in the contrib version by only vacuuming shared catalogs inside template1, however it would then analyze those tables in each and every database. Is there a reason this solution is not adequate? The problem is that now that we've invented the default postgres database, it becomes more plausible to think about installations that haven't got a template1 at all. I'd prefer a solution that does not assume the presence of any specific database. ISTM reasonable to process the shared catalogs symmetrically in every DB: look to see if they need vacuuming or not. The problem (which was also a problem for the contrib version) is that the stats system fails to maintain a single set of stats for a shared catalog --- operations get counted under whichever DB they were issued from. 
This means that autovac will underestimate the need for vacuuming of a shared catalog, since no matter where it looks from, it will see only a portion of the true update activity.

OK, so without reworking the stats system, I don't see an easy answer to this other than autovacuum trying to sum up all the activity it finds in all the different databases it looks at, but that seems rather ugly. Any thoughts on improving the stats situation here?

---(end of broadcast)--- TIP 6: explain analyze is your friend
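The undercounting problem, and the "sum across databases" workaround floated above, can be illustrated with a toy model; the database names and counts below are made up for illustration and are not real pgstats output.

```python
# Toy model of per-database pgstats counters for one shared catalog.
# Because operations are counted under whichever database issued them,
# no single database's view reflects the catalog's true update activity.
per_db_updates = {
    "template1": 40,
    "postgres": 35,
    "app_db": 25,
}

def updates_seen_from(db: str) -> int:
    """What autovacuum sees when it checks the shared catalog from `db`."""
    return per_db_updates[db]

def true_update_count() -> int:
    """The ugly workaround: sum the activity seen from every database."""
    return sum(per_db_updates.values())
```

From any single database autovacuum sees at most 40 updates here, while the catalog has actually accumulated 100; any vacuum threshold falling between those two values would simply never fire without the cross-database sum.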
Re: [HACKERS] PL/pgSQL Debugger Support
Tom Lane wrote: Denis Lussier [EMAIL PROTECTED] writes: I got to thinking it'd be kewl if PgAdmin3 supported an interactive debugger for pl/pgsql.

That's been kicked around before, although I don't think anyone wants to tie it to pgAdmin specifically. Check the archives...

I didn't find anything relevant after a quick search, but if memory serves, one of the objections to PgAdmin was that it was Windows-only. This of course is no longer true as of PgAdmin III 1.0, which supports Win32, Linux, and FreeBSD. So perhaps that objection is no longer valid.
Re: [HACKERS] [PATCHES] default database creation with initdb
Tom Lane wrote: One thing that neither Dave nor I wanted to touch is pg_autovacuum. If that gets integrated into the backend by feature freeze then the question is moot, but if it doesn't then we'll have to decide whether autovac should preferentially connect to template1 or postgres. Neither choice seems real appealing to me: if autovac connects to template1 then it could interfere with CREATE DATABASE, but if it connects to postgres then it could fail if postgres isn't there. Now the latter does not bother me if autovac is considered a client, but it does bother me if autovac is considered part of the backend. I think that template1 and template0 can reasonably be considered special from the point of view of the backend --- but I really don't want postgres to be special in that way. I'm still hoping that autovac will get integrated so this will be moot, but just in case.

Perhaps pg_autovacuum should try to connect to the postgres database, and if that connection fails, fall back to template1. This way autovacuum will work whether the postgres database is there or not.
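The fallback suggested above can be sketched as follows; `connect()` here is a stand-in for a real libpq/driver connection attempt, and the set of available databases is hypothetical.

```python
# Minimal sketch of the proposed connection strategy: prefer the
# "postgres" database, fall back to template1 if it has been dropped.
# connect() is a stand-in for a real connection call, not a real API.

class ConnectionFailed(Exception):
    pass

AVAILABLE_DATABASES = {"template0", "template1"}  # "postgres" was dropped

def connect(dbname: str) -> str:
    if dbname not in AVAILABLE_DATABASES:
        raise ConnectionFailed(f'database "{dbname}" does not exist')
    return dbname  # a real version would return a connection handle

def autovac_connect() -> str:
    try:
        return connect("postgres")
    except ConnectionFailed:
        return connect("template1")
```

With postgres absent the daemon lands on template1; restore postgres to the set and the same call prefers it, which is exactly the "works whether postgres is there or not" property being proposed.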
[HACKERS] Autovacuum Patch Status
Sorry to do this on the hackers list, but I have tried to email Alvaro off-list and my email keeps getting bounced. So, Alvaro, I was just wondering what the current status of your work on the autovacuum patch is, and whether you would like to discuss anything or there is any way I can help. My time is limited, but I can help think things through and help with design issues. I can also probably contribute a few hours of actual coding time this week / this coming weekend. Anyway, I know there was a lot of conversation on the hackers list and I just wanted to see how you were doing.

Thanks,

Matthew O'Connor
Re: [HACKERS] Autovacuum in the backend
Joshua D. Drake wrote: Josh Berkus wrote: I've personally seen at least a dozen user requests for autovacuum in the backend, and had this conversation about 1,100 times: NB: After a week, my database got really slow. Me: How often are you running VACUUM ANALYZE? NB: Running what?

Can't argue with that, except... RTFM ;). I am not saying it doesn't have validity. I am just saying that if you actually pay attention to PostgreSQL and maintain it, you don't need it ;)

I think everyone on this list would agree with you. The only reason I think the newbie protection is important (and I don't think it's the most important reason for autovacuum) is that perception is reality, to some extent. Valid or not, we still suffer from a reputation of being more complicated and slower than MySQL. Steps toward reducing or eliminating that perception can only be good for us, as I think lots of developers make their first database decision based solely on their perceptions and then just stick with what they know.