Re: [PATCHES] Load distributed checkpoint V4
On Thu, 19 Apr 2007, Heikki Linnakangas wrote: In the sync phase, we sleep between each fsync until enough time/segments have passed, assuming that the time to fsync is proportional to the file length. I'm not sure that's a very good assumption. I've been making scatter plots of fsync time vs. amount written to the database for a couple of months now, and while there's a trend there it's not a linear one based on data written. Under Linux, to make a useful prediction about how long a fsync will take you first need to consider how much dirty data is already in the OS cache (the "Dirty:" figure in /proc/meminfo) before the write begins, relative to the kernel parameters that control write behavior. Combine that with some knowledge of the caching behavior of the controller/disk combination you're using, and it's just barely possible to make a reasonable estimate. Any less information than all that and you really have very little basis on which to guess how long it's going to take. Other operating systems are going to give completely different behavior here, which of course makes the problem even worse. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Hi, Sorry, because of so many comments/questions, I'll write inline Josh Berkus wrote: Hackers, Writing lots of additional code simply to remove a parameter that *might* be mis-interpreted doesn't sound useful to me, especially when bugs may leak in that way. My take is that this is simple and useful *and* we have it now; other ways don't yet exist, nor will they in time for 8.3. How about naming the parameter wal_compressable? That would indicate pretty clearly that the parameter is intended to be used with wal_compress and nothing else. Hmm, it sounds nicer. However, I do agree with Andreas that anything which adds to WAL volume, even 3%, seems like going in the wrong direction. We already have higher log output than any comparable database (higher than InnoDB by 3x) and we should be looking for output to trim as well as compression. So the relevant question is whether the patch in its current form provides enough benefit to make it worthwhile for 8.3, or whether we should wait for 8.4. Questions: Before answering questions below, I'd like to say that archive log optimization has to be address different point of views to the current (upto 8.2) settings. 1) To deal with partial/inconsisitent write to the data file at crash recovery, we need full page writes at the first modification to pages after each checkpoint. It consumes much of WAL space. 2) 1) is not necessary for archive recovery (PITR) and full page writes can be removed for this purpose. However, we need full page writes during hot backup to deal with partial writes by backup commands. This is implemented in 8.2. 3) To maintain crash recovery chance and reduce the amount of archive log, removal of unnecessary full page writes from archive logs is a good choice. To do this, we need both logical log and full page writes in WAL. I don't think there should be only one setting. It depend on how database is operated. Leaving wal_add_optiomization_info = off default does not bring any change in WAL and archive log handling. I understand some people may not be happy with additional 3% or so increase in WAL size, especially people who dosn't need archive log at all. So I prefer to leave the default off. For users, I think this is simple enough: 1) For people happy with 8.2 settings: No change is needed to move to 8.3 and there's really no change. 2) For people who need to reduce archive log size but like to leave full page writes to WAL (to maintain crash recovery chance): a) Add GUC parameter: wal_add_optiomization_info=on b) Change archive command from "cp" to "pg_compresslog" c) Change restore command from "cp" to "pg_decompresslog" Archive log can be stored and restored as done in older releases. 1) is there any throughput benefit for platforms with fast CPU but contrained I/O (e.g. 2-drive webservers)? Any penalty for servers with plentiful I/O? I've only run benchmarks with archive process running, because wal_add_optimization_info=on does not make sense if we don't archive WAL. In this situation, total I/O decreases because writes to archive log decreases. Because of 3% or so increase in WAL size, there will be increase in WAL write, but decrease in archive writes makes it up. 2) Will this patch make attempts to reduce WAL volume in the future significantly harder? Yes, I'd like to continue to work to reduce the WAL size. It's still an issue when database size becomes several handreds of gigabytes in size. Anyway, I think WAL size reduction has to be done in XLogInsert() or XLogWrite(). We need much more discussion for this. The issue will be how to maintain crash recovery chance by inconsistent writes (by full_page_writes=off, we have to give it up). On the other hand we have to keep examining each WAL record. 3) How is this better than command-line compression for log-shipping? e.g. why do we need it in the database? I don't fully understand what command-line compression means. Simon suggested that this patch can be used with log-shipping and I agree. If we compare compression with gzip or other general purpose compression, compression ratio, CPU usage and I/O by pg_compresslog are all quite better than those in gzip. Please let me know if you intended defferently. Regards; -- - Koichi Suzuki ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Hackers, > Writing lots of additional code simply to remove a parameter that > *might* be mis-interpreted doesn't sound useful to me, especially when > bugs may leak in that way. My take is that this is simple and useful > *and* we have it now; other ways don't yet exist, nor will they in time > for 8.3. How about naming the parameter wal_compressable? That would indicate pretty clearly that the parameter is intended to be used with wal_compress and nothing else. However, I do agree with Andreas that anything which adds to WAL volume, even 3%, seems like going in the wrong direction. We already have higher log output than any comparable database (higher than InnoDB by 3x) and we should be looking for output to trim as well as compression. So the relevant question is whether the patch in its current form provides enough benefit to make it worthwhile for 8.3, or whether we should wait for 8.4. Questions: 1) is there any throughput benefit for platforms with fast CPU but contrained I/O (e.g. 2-drive webservers)? Any penalty for servers with plentiful I/O? 2) Will this patch make attempts to reduce WAL volume in the future significantly harder? 3) How is this better than command-line compression for log-shipping? e.g. why do we need it in the database? -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
> I don't insist the name and the default of the GUC parameter. > I'm afraid wal_fullpage_optimization = on (default) makes > some confusion because the default behavior becomes a bit > different on WAL itself. Seems my wal_fullpage_optimization is not a good name if it caused misinterpretation already :-( > >> Amount of WAL after 60min. run of DBT-2 benchmark > >> wal_add_optimization_info = off (default) 3.13GB > > > > how about wal_fullpage_optimization = on (default) The meaning of wal_fullpage_optimization = on (default) would be the same as your wal_add_optimization_info = off (default). (Reversed name, reversed meaning of the boolean value) It would be there to *turn off* the (default) WAL full_page optimization. For your pg_compresslog it would need to be set to off. "add_optimization_info" sounded like added info about/for some optimization which it is not. We turn off an optimization with the flag for the benefit of an easier pg_compresslog implementation. As already said I would decouple this setting from the part that sets the "removeable full page" flag in WAL, and making the recovery able to skip dummy records. This I would do unconditionally. Andreas ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] fix LOCK_DEBUG
On Mon, 2007-04-23 at 14:22 +0100, Heikki Linnakangas wrote: > 8.2 branch doesn't compile with LOCK_DEBUG enabled Applied, thanks. -Neil ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PATCHES] fix LOCK_DEBUG
8.2 branch doesn't compile with LOCK_DEBUG enabled because of a missing include. It was added to CVS HEAD in December.. Index: src/backend/utils/misc/guc.c === RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/utils/misc/guc.c,v retrieving revision 1.360 diff -c -r1.360 guc.c *** src/backend/utils/misc/guc.c29 Nov 2006 14:50:07 - 1.360 --- src/backend/utils/misc/guc.c23 Apr 2007 13:21:47 - *** *** 26,31 --- 26,32 #endif + #include "access/transam.h" #include "access/gin.h" #include "access/twophase.h" #include "access/xact.h" -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Dead Space Map version 3 (simplified)
"Hiroki Kataoka" <[EMAIL PROTECTED]> writes: > I think there is no problem. Bloating will make pages including the > unnecessary area which will not be accessed. Soon, those pages will be > registered into DSM. Except the whole point of the DSM is to let us vacuum those pages *before* that happens... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend