Re: [PATCHES] Snapshot management, final
Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > > CopySnapshot always copies snapshots to SnapshotContext, which is a > > context that lives until transaction end. There's no mechanism for > > copying a snapshot into another context, because I don't see the need. > > The only reason we have memory contexts at all is to avoid the need to > track individual palloc'd objects. Since we're instituting exactly such > tracking for snapshots, there's no value in placing them in > general-purpose memory contexts. The problem is that we reuse snapshots, and not all uses have the same longevity. If a context goes away from under a snapshot and there are other references to it, the result is a dangling pointer somewhere. That's why we have reference counts on snaps: we know we can free one when its refcounts are zero. At the same time, the snapshots all go away at transaction end with TopTransactionContext. The other possible approach to this problem is creating a separate copy each time a snapshot is reused, but this just causes extra palloc'ing for no gain at all. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Snapshot management, final
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Simon Riggs wrote: >> OK, so it can;t be copied to a longer lived memory context? > CopySnapshot always copies snapshots to SnapshotContext, which is a > context that lives until transaction end. There's no mechanism for > copying a snapshot into another context, because I don't see the need. The only reason we have memory contexts at all is to avoid the need to track individual palloc'd objects. Since we're instituting exactly such tracking for snapshots, there's no value in placing them in general-purpose memory contexts. regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Snapshot management, final
Simon Riggs wrote: > OK, so it can;t be copied to a longer lived memory context? CopySnapshot always copies snapshots to SnapshotContext, which is a context that lives until transaction end. There's no mechanism for copying a snapshot into another context, because I don't see the need. If you need that ability, please explain. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Snapshot management, final
On Tue, 2008-04-22 at 17:50 -0400, Alvaro Herrera wrote: > Simon Riggs wrote: > > On Tue, 2008-04-22 at 15:49 -0400, Alvaro Herrera wrote: > > > > > - Three CopySnapshot call sites remain outside snapmgr.c: DoCopy() on > > > copy.c, ExplainOnePlan() on explain.c and _SPI_execute_plan() on spi.c. > > > They are there because they grab the current ActiveSnapshot, modify it, > > > and then use the resulting snapshot. There is no corresponding > > > FreeSnapshot, because it's not needed. > > > > Not needed? How can we be certain that the modified snapshot does not > > outlive its original source? > > It's not CopySnapshot that's not needed, but FreeSnapshot. The point > here is that the snapshot will be freed automatically as soon as it is > PopActiveSnapshot'd out of existance. CopySnapshot creates a new, > separate copy of the passed snapshot, and each of them will be freed > (separately) as soon as their refcounts reach zero. OK, so it can;t be copied to a longer lived memory context? -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Snapshot management, final
Simon Riggs wrote: > On Tue, 2008-04-22 at 15:49 -0400, Alvaro Herrera wrote: > > > - Three CopySnapshot call sites remain outside snapmgr.c: DoCopy() on > > copy.c, ExplainOnePlan() on explain.c and _SPI_execute_plan() on spi.c. > > They are there because they grab the current ActiveSnapshot, modify it, > > and then use the resulting snapshot. There is no corresponding > > FreeSnapshot, because it's not needed. > > Not needed? How can we be certain that the modified snapshot does not > outlive its original source? It's not CopySnapshot that's not needed, but FreeSnapshot. The point here is that the snapshot will be freed automatically as soon as it is PopActiveSnapshot'd out of existance. CopySnapshot creates a new, separate copy of the passed snapshot, and each of them will be freed (separately) as soon as their refcounts reach zero. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Snapshot management, final
On Tue, 2008-04-22 at 15:49 -0400, Alvaro Herrera wrote: > - Three CopySnapshot call sites remain outside snapmgr.c: DoCopy() on > copy.c, ExplainOnePlan() on explain.c and _SPI_execute_plan() on spi.c. > They are there because they grab the current ActiveSnapshot, modify it, > and then use the resulting snapshot. There is no corresponding > FreeSnapshot, because it's not needed. Not needed? How can we be certain that the modified snapshot does not outlive its original source? -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
[PATCHES] Snapshot management, final
Here is the patch for snapshot management I currently have. Below is a complete description of the changes in this patch. The most important change is on the use of CopySnapshot and the games with ActiveSnapshot. On the new code, whenever a piece of code needs a snapshot to persist, it is "registered" by using RegisterSnapshot. As soon as it is done with it, it must be unregistered with UnregisterSnapshot. For ActiveSnapshot, we deal with a stack structure. Callers needing an Active snap set call PushActiveSnapshot, and PopActiveSnapshot when they no longer need it. GetSnapshotData still creates initial versions of each snapshot in static memory. There's a new boolean on the SnapshotData struct (set to false by GetSnapshotData), which indicates whether the snapshot has been copied out of static memory. When a snapshot is Registered or PushedActive, this bit is checked: if false, then a copy of the snapshot is made on the dedicated SnapshotContext, and the "copied" flag is flipped. SnapshotData also carries two new reference counts: one for the Register list, another one for the ActiveSnapshot stack. Whenever a snap is Unregistered or Popped, both refcounts are checked. If both are found to be zero then memory can be released. On Unregister and Pop, after deleting the snapshot, we also verify the whole lists. If they are empty, it means no more live snapshot remain for this transaction. In this case we can reset MyProc->xmin to InvalidXid. Thus, GetSnapshotData must now recalculate MyProc->xmin each time it finds xmin to be Invalid, which can not only happen for the serializable snapshot but at any time. A serializable transaction Registers its snapshot as first thing, and keeps it registered until transaction end. Note: I had previously made noises about changing the semantics of MyProc->xmin or introducing a new VacuumXmin. I have refrained from doing so in this patch. Those are the high-level changes. Some coding changes: - SerializableSnapshot and LatestSnapshot as global symbols are no more. - A couple of PG_TRY blocks have been removed. Particularly, on plancache.c this means the do_planning() routine is now pointless and has been folded back into its sole caller. - Three CopySnapshot call sites remain outside snapmgr.c: DoCopy() on copy.c, ExplainOnePlan() on explain.c and _SPI_execute_plan() on spi.c. They are there because they grab the current ActiveSnapshot, modify it, and then use the resulting snapshot. There is no corresponding FreeSnapshot, because it's not needed. - CommandCounterIncrement now calls into snapmgr.c to set the CID of the static snapshots. - CREATE INDEX CONCURRENTLY, VACUUM FULL, and CLUSTER must now explicitely Pop the ActiveSnapshot set by PortalRunUtility before calling CommitTransactionCommand. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. snapshot-6.patch.bz2 Description: Binary data -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Partial match in GIN (next vesrion)
http://www.sigaev.ru/misc/partial_match_gin-0.9.gz Sync with CVS. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Improve shutdown during online backup, take 3
Albe Laurenz wrote: > Simon Riggs wrote: > > Patch applies, and works as described. Looks good for final apply. > > > > Few minor thoughts: > > > > * Text in pg_ctl should be WARNING, not Warning. > > * CancelBackup() API looks strange, not sure why > > * Need to mention that CancelBackup() is not the right way to end a > > backup, so that function and pg_stop_backup should reference > > each other > > > > Other than those, I like it. Very useful patch. > > Thanks for the feedback! > > - I have replaced "Warning" with WARNING". > - I have changed the API of CancelBackup() to return void. > I don't use the return code anyway, and I guess it's less confusing > and "strange" that way. > - I have added comments to disambiguate pg_stop_backup() and > CancelBackup(). > > CancelBackup now writes a message to the server log if it cannot > delete backup_label - I hope that's not too verbose... This doesn't look like our normal coding standards, and should probably be changed: + if (0 != stat(BACKUP_LABEL_FILE, &stat_buf)) (there's a number of similar places) //Magnus -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Removing NONSEG mode
Zdenek Kotala wrote: > I attach patch which remove nonsegment mode support. It was discussed during > last commit fest. Nonsegment mode is possible uses only on couple of FS (ZFS, > XFS) and it is not safe on any OS because each OS support more filesystems. Added to May commitfest, thanks. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Improve shutdown during online backup, take 3
Simon Riggs wrote: > Patch applies, and works as described. Looks good for final apply. > > Few minor thoughts: > > * Text in pg_ctl should be WARNING, not Warning. > * CancelBackup() API looks strange, not sure why > * Need to mention that CancelBackup() is not the right way to end a > backup, so that function and pg_stop_backup should reference > each other > > Other than those, I like it. Very useful patch. Thanks for the feedback! - I have replaced "Warning" with WARNING". - I have changed the API of CancelBackup() to return void. I don't use the return code anyway, and I guess it's less confusing and "strange" that way. - I have added comments to disambiguate pg_stop_backup() and CancelBackup(). CancelBackup now writes a message to the server log if it cannot delete backup_label - I hope that's not too verbose... Yours, Laurenz Albe backup-shut.patch Description: backup-shut.patch backup-shut.doc.patch Description: backup-shut.doc.patch -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
[PATCHES] Removing NONSEG mode
I attach patch which remove nonsegment mode support. It was discussed during last commit fest. Nonsegment mode is possible uses only on couple of FS (ZFS, XFS) and it is not safe on any OS because each OS support more filesystems. I added RELSEG option to the configure script to allow easily compile with different segment size (on most filesystem 1T is safe value). As a bonus I added also BLCKSZ to configure script. It is not important for this patch but it could be useful e.g. for buildfarm testing with different BLCKSZ. Patch requires to run autoconf and autoheader. Zdenek PS: --with-segsize=1/1024 allows set segsize to 1MB - good for testing Index: configure.in === RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/configure.in,v retrieving revision 1.555 diff -c -r1.555 configure.in *** configure.in 30 Mar 2008 04:08:14 - 1.555 --- configure.in 21 Apr 2008 15:19:59 - *** *** 220,233 # # Data file segmentation # ! PGAC_ARG_BOOL(enable, segmented-files, yes, ! [ --disable-segmented-files disable data file segmentation (requires largefile support)]) # # C compiler # # For historical reasons you can also use --with-CC to specify the C compiler # to use, although the standard way to do this is to set the CC environment # variable. PGAC_ARG_REQ(with, CC, [], [CC=$with_CC]) --- 220,287 # # Data file segmentation # ! AC_MSG_CHECKING([for default relation segment size]) ! PGAC_ARG_REQ(with, segsize, [ --with-segsize=RELSEG_SIZE change default relation segment size in GB [[1]]], ! [default_segsize=$withval], ! [default_segsize=1]) ! AC_MSG_RESULT([${default_segsize}GB]) ! AC_DEFINE_UNQUOTED([RELSEG_SIZE], 1024*1024*1024LL*${default_segsize}/BLCKSZ, [ ! RELSEG_SIZE is the maximum number of blocks allowed in one disk ! file. Thus, the maximum size of a single file is RELSEG_SIZE * BLCKSZ; ! relations bigger than that are divided into multiple files. ! ! RELSEG_SIZE * BLCKSZ must be less than your OS' limit on file size. ! This is often 2 GB or 4GB in a 32-bit operating system, unless you ! have large file support enabled. By default, we make the limit 1 ! GB to avoid any possible integer-overflow problems within the OS. ! A limit smaller than necessary only means we divide a large ! relation into more chunks than necessary, so it seems best to err ! in the direction of a small limit. (Besides, a power-of-2 value ! saves a few cycles in md.c.) + Changing RELSEG_SIZE requires an initdb. + ]) + AC_SUBST(default_segsize) + + # + # Block size # + AC_MSG_CHECKING([for default block size]) + PGAC_ARG_REQ(with, blocksize, [ --with-blocksize=BLCKSZ change default block size (1,2,4,8,16,32 are allowed values). [[8]]], + [default_blocksize=$withval], + [default_blocksize=8]) + case ${default_blocksize} in + 1) default_blocksize=1024;; + 2) default_blocksize=2048;; + 4) default_blocksize=4096;; + 8) default_blocksize=8192;; + 16) default_blocksize=16384;; + 32) default_blocksize=32768;; + *) AC_MSG_ERROR([Invalid block size. Allowed values are 1,2,4,8,16,32.]) + esac + + AC_MSG_RESULT([${default_blocksize}B]) + AC_DEFINE_UNQUOTED([BLCKSZ], ${default_blocksize}, [ + Size of a disk block --- this also limits the size of a tuple. You + can set it bigger if you need bigger tuples (although TOAST should + reduce the need to have large tuples, since fields can be spread + across multiple tuples). + + BLCKSZ must be a power of 2. The maximum possible value of BLCKSZ + is currently 2^15 (32768). This is determined by the 15-bit widths + of the lp_off and lp_len fields in ItemIdData (see + include/storage/itemid.h). + + Changing BLCKSZ requires an initdb. + ]) + AC_SUBST(default_blocksize) + + # C compiler # # For historical reasons you can also use --with-CC to specify the C compiler + # to use, although the standard way to do this is to set the CC environment # variable. PGAC_ARG_REQ(with, CC, [], [CC=$with_CC]) *** *** 1435,1443 # Check for largefile support (must be after AC_SYS_LARGEFILE) AC_CHECK_SIZEOF([off_t]) ! ! if test "$ac_cv_sizeof_off_t" -lt 8 -o "$enable_segmented_files" = "yes"; then ! AC_DEFINE([USE_SEGMENTED_FILES], 1, [Define to split data files into 1GB segments.]) fi # SunOS doesn't handle negative byte comparisons properly with +/- return --- 1489,1496 # Check for largefile support (must be after AC_SYS_LARGEFILE) AC_CHECK_SIZEOF([off_t]) ! if test "$ac_cv_sizeof_off_t" -lt 8 -a "$default_segsize" != "1"; then !AC_MSG_ERROR([Large file support is not enabled. Segment size cannot be larger then 1GB.]) fi # SunOS doesn't handle negative byte comparisons properly with +/- return Index: src/backend/storage/file/buffile.c ==