Re: [HACKERS] Core Extensions relocation
On tor, 2011-06-09 at 00:14 -0400, Greg Smith wrote:
> Following up on the idea we've been exploring for making some
> extensions more prominent, attached is the first rev that I think may
> be worth considering seriously. Main improvement from the last is
> that I reorganized the docs to break out what I decided to
> tentatively name "Core Extensions" into their own chapter. No longer
> mixed in with the rest of the contrib modules, and I introduce them a
> bit differently.

For the directory name, I'd prefer either src/extensions (since there
is more than one), or if you want to go for short somehow, src/ext.
(Hmm, I guess the installation subdirectory is also called
"extension". But it felt wrong on first reading anyway.)

There is some funny business in your new src/extension/Makefile. You
apparently based this on a very old version of contrib/Makefile (it
still contains a CVS keyword header), it uses for loops in make
targets after we just got rid of them, and it references some modules
that aren't there at all. That file needs a complete redo based on
current sources, I think. Equally, your new extension-global.mk sets
MODULEDIR, which is no longer necessary, and has a CVS header. What
version did you branch this off? :)

Perhaps a small addition to the installation instructions would also
be appropriate, to tell people that certain core extensions, as it
were, are installed by default.
Re: [HACKERS] procpid?
On Jun 9, 2011, at 11:29 AM, Robert Haas wrote:
> On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian wrote:
>> Can someone explain why pg_stat_activity has a column named procpid
>> and not simply pid? 'pid' is what pg_locks uses, and 'procpid' is
>> redundant (proc-process-id). A mistake?
>
> Well, we refer to the slots that backends use as "procs" (really
> PGPROC), so I'm guessing that this was intended to mean "the pid
> associated with the proc". It might not be the greatest name but I
> can't see changing it now.

It's damn annoying... enough so that I'd personally be in favor of
creating a pid column that has the same data so we can deprecate
procpid and eventually remove it...
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell)                http://jim.nasby.net
Re: [HACKERS] literature on write-ahead logging
On Jun 9, 2011, at 9:28 AM, Robert Haas wrote:
> On Thu, Jun 9, 2011 at 10:22 AM, Alvaro Herrera wrote:
>>> 1. Subdivide XLOG insertion into three operations: (1) allocate
>>> space in the log buffer, (2) copy the log records into the
>>> allocated space, and (3) release the space to the buffer manager
>>> for eventual write to disk. AIUI, WALInsertLock currently covers
>>> all three phases of this operation, but phase 2 can proceed in
>>> parallel. It's pretty easy to imagine maintaining one pointer that
>>> references the next available byte of log space (let's call this
>>> the "next insert" pointer), and a second pointer that references
>>> the byte following the last byte known to be written (let's call
>>> this the "insert done" pointer).
>>
>> I think this can be done more simply if instead of a single "insert
>> done" pointer you have an array of them, one per backend; there's
>> also a global pointer that can be advanced per the minimum of the
>> bunch, which you can calculate with some quick locking of the array.
>> You don't need to sleep at all, except to update the array and
>> calculate the global ptr, so this is probably also faster.
>
> I think looping over an array with one entry per backend is going to
> be intolerably slow... but it's possible I'm wrong.

If the global pointer is just tracking the minimum of the array
entries, do you actually have to lock the entire array? The global
pointer is going to be "behind" soon enough anyway, so why does it
need a static view of the entire array? ISTM you only need to ensure
that updating individual elements in the array is atomic.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell)                http://jim.nasby.net
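To make the scheme concrete, here is a minimal C11 sketch of the
per-backend array Alvaro describes and the property Jim is relying on:
each slot is read and written atomically on its own, and a slightly
stale view of the array is harmless because the global pointer can
only lag behind, never overshoot. All names here are hypothetical --
PostgreSQL itself would use its own spinlock and barrier primitives
rather than C11 atomics:

    #include <stdatomic.h>
    #include <stdint.h>

    #define MAX_BACKENDS 64
    #define IDLE_SLOT    UINT64_MAX  /* backend not currently inserting */

    /*
     * While copying, each backend publishes the START of its reserved
     * WAL space here; when finished it resets its slot to IDLE_SLOT.
     * (All slots must be initialized to IDLE_SLOT at startup.)
     */
    static _Atomic uint64_t insert_start[MAX_BACKENDS];

    /* everything below this position is known to be fully copied */
    static _Atomic uint64_t insert_done_global;

    static void
    begin_insert(int backend_id, uint64_t start_of_reserved_space)
    {
        atomic_store_explicit(&insert_start[backend_id],
                              start_of_reserved_space,
                              memory_order_release);
    }

    static void
    end_insert(int backend_id)
    {
        atomic_store_explicit(&insert_start[backend_id], IDLE_SLOT,
                              memory_order_release);
    }

    /*
     * Advance the global pointer to the minimum in-progress position,
     * or to next_insert if nobody is mid-copy.  No lock over the whole
     * array: each element is loaded atomically, and a stale element
     * only makes the result more conservative.  (Assumes one advancing
     * process; concurrent advancers would need a compare-and-swap.)
     */
    static void
    advance_insert_done(uint64_t next_insert)
    {
        uint64_t min = next_insert;

        for (int i = 0; i < MAX_BACKENDS; i++)
        {
            uint64_t v = atomic_load_explicit(&insert_start[i],
                                              memory_order_acquire);
            if (v != IDLE_SLOT && v < min)
                min = v;
        }
        if (min > atomic_load_explicit(&insert_done_global,
                                       memory_order_relaxed))
            atomic_store_explicit(&insert_done_global, min,
                                  memory_order_release);
    }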
Re: [HACKERS] Small SSI issues
On Fri, Jun 10, 2011 at 09:43:58PM +0300, Heikki Linnakangas wrote:
>> Do checks such as that argue for keeping the volatile flag, or do
>> you think we can drop it if we make those changes? (That would also
>> allow dropping a number of casts which exist just to avoid
>> warnings.)
>
> I believe we can drop it, I'll double-check.

Yes, dropping it seems like the thing to do. It's been on my list for
a while. We are not really getting anything out of declaring it
volatile since we cast the volatile qualifier away most of the time.

Dan

--
Dan R. K. Ports   MIT CSAIL   http://drkp.net/
Re: [HACKERS] gcc 4.6 and hot standby
On Fri, Jun 10, 2011 at 14:24, Tom Lane wrote:
> Alex Hunsaker writes:
>> Hrm, couldn't we change all the references to tmpRecPtr to use
>> RecPtr instead? (Except of course where we assign RecPtr =
>> &tmpRecPtr.) I think that would make the code look a lot less
>> confused. Something like the attached?
>
> Yeah, we could do that too; slightly modified version of your change
> included in the attached.

A cassert-enabled build seems to be working; I'll leave it running
over the weekend...
Re: [HACKERS] gcc 4.6 and hot standby
Alex Hunsaker writes:
> On Fri, Jun 10, 2011 at 12:38, Tom Lane wrote:
>> I think we need a workaround.

My second idea about moving the test up doesn't work, because we
can't know the page header size until after we've read the page. But
I've verified that the attached patch does make the problem go away
on my F15 box.

> Hrm, couldn't we change all the references to tmpRecPtr to use RecPtr
> instead? (Except of course where we assign RecPtr = &tmpRecPtr.) I
> think that would make the code look a lot less confused. Something
> like the attached?

Yeah, we could do that too; slightly modified version of your change
included in the attached.

			regards, tom lane

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5c3ca479fb33fff646e3a7b08b53efea92b9a97f..aa0b0291ee1c7781a36c62e3d89abbc98d3b8499 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*** ReadRecord(XLogRecPtr *RecPtr, int emode
*** 3728,3750 ****
  		RecPtr = &tmpRecPtr;

  		/*
! 		 * Align recptr to next page if no more records can fit on the current
! 		 * page.
  		 */
  		if (XLOG_BLCKSZ - (RecPtr->xrecoff % XLOG_BLCKSZ) < SizeOfXLogRecord)
! 		{
! 			NextLogPage(tmpRecPtr);
! 			/* We will account for page header size below */
! 		}
! 		if (tmpRecPtr.xrecoff >= XLogFileSize)
  		{
! 			(tmpRecPtr.xlogid)++;
! 			tmpRecPtr.xrecoff = 0;
  		}
  	}
  	else
  	{
  		if (!XRecOffIsValid(RecPtr->xrecoff))
  			ereport(PANIC,
  					(errmsg("invalid record offset at %X/%X",
--- 3728,3759 ----
  		RecPtr = &tmpRecPtr;

  		/*
! 		 * RecPtr is pointing to end+1 of the previous WAL record. We must
! 		 * advance it if necessary to where the next record starts. First,
! 		 * align to next page if no more records can fit on the current page.
  		 */
  		if (XLOG_BLCKSZ - (RecPtr->xrecoff % XLOG_BLCKSZ) < SizeOfXLogRecord)
! 			NextLogPage(*RecPtr);
! 		/* Check for crossing of xlog segment boundary */
! 		if (RecPtr->xrecoff >= XLogFileSize)
  		{
! 			(RecPtr->xlogid)++;
! 			RecPtr->xrecoff = 0;
  		}
+
+ 		/*
+ 		 * If at page start, we must skip over the page header. But we can't
+ 		 * do that until we've read in the page, since the header size is
+ 		 * variable.
+ 		 */
  	}
  	else
  	{
+ 		/*
+ 		 * In this case, the passed-in record pointer should already be
+ 		 * pointing to a valid record starting position.
+ 		 */
  		if (!XRecOffIsValid(RecPtr->xrecoff))
  			ereport(PANIC,
  					(errmsg("invalid record offset at %X/%X",
*** retry:
*** 3773,3783 ****
  	if (targetRecOff == 0)
  	{
  		/*
! 		 * Can only get here in the continuing-from-prev-page case, because
! 		 * XRecOffIsValid eliminated the zero-page-offset case otherwise. Need
! 		 * to skip over the new page's header.
  		 */
! 		tmpRecPtr.xrecoff += pageHeaderSize;
  		targetRecOff = pageHeaderSize;
  	}
  	else if (targetRecOff < pageHeaderSize)
--- 3782,3794 ----
  	if (targetRecOff == 0)
  	{
  		/*
! 		 * At page start, so skip over page header. The Assert checks that
! 		 * we're not scribbling on caller's record pointer; it's OK because we
! 		 * can only get here in the continuing-from-prev-record case, since
! 		 * XRecOffIsValid rejected the zero-page-offset case otherwise.
  		 */
! 		Assert(RecPtr == &tmpRecPtr);
! 		RecPtr->xrecoff += pageHeaderSize;
  		targetRecOff = pageHeaderSize;
  	}
  	else if (targetRecOff < pageHeaderSize)
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index eeccdce31d076322cc5431117c6705b839b1b162..7e39630c1bf5d7cbf1a721b641a9481069e92816 100644
*** a/src/include/access/xlog_internal.h
--- b/src/include/access/xlog_internal.h
*** typedef XLogLongPageHeaderData *XLogLong
*** 154,166 ****
  /* Align a record pointer to next page */
  #define NextLogPage(recptr) \
  	do { \
! 		if (recptr.xrecoff % XLOG_BLCKSZ != 0) \
! 			recptr.xrecoff += \
! 				(XLOG_BLCKSZ - recptr.xrecoff % XLOG_BLCKSZ); \
! 		if (recptr.xrecoff >= XLogFileSize) \
  		{ \
! 			(recptr.xlogid)++; \
! 			recptr.xrecoff = 0; \
  		} \
  	} while (0)
--- 154,166 ----
  /* Align a record pointer to next page */
  #define NextLogPage(recptr) \
  	do { \
! 		if ((recptr).xrecoff % XLOG_BLCKSZ != 0) \
! 			(recptr).xrecoff += \
! 				(XLOG_BLCKSZ - (recptr).xrecoff % XLOG_BLCKSZ); \
! 		if ((recptr).xrecoff >= XLogFileSize) \
  		{ \
! 			((recptr).xlogid)++; \
! 			(recptr).xrecoff = 0; \
  		} \
  	} while (0)
Re: [HACKERS] gcc 4.6 and hot standby
On Fri, Jun 10, 2011 at 12:38, Tom Lane wrote:
>
> I've been able to reproduce this on released Fedora 15, and sure
> enough it is a compiler bug. The problem comes from these fragments
> of ReadRecord():
> [ ... ]

Whoa, awesome. I spent a few more hours comparing the assembly-- then
I tried compiling a subset of xlog.c with some educated guesses as to
what function might be getting mis-compiled. Obviously my guesses were
not educated enough! :-)

> I've filed a bug report for this:
> https://bugzilla.redhat.com/show_bug.cgi?id=712480
> but I wouldn't care to hold my breath while waiting for a fix to
> appear upstream, let alone propagate to all the distros already using
> 4.6.0.

I wouldn't hold my breath either.

> I think we need a workaround.
>
> The obvious question to ask here is why are we updating
> "tmpRecPtr.xrecoff" and not "RecPtr->xrecoff"? The latter choice
> would be a lot more readable, since the immediately surrounding code
> doesn't refer to tmpRecPtr. I think the idea is to guarantee that the
> caller's referenced record pointer (if any) isn't modified, but if
> RecPtr is not pointing at tmpRecPtr here, we have got big problems
> anyway.

Hrm, couldn't we change all the references to tmpRecPtr to use RecPtr
instead? (Except of course where we assign RecPtr = &tmpRecPtr.) I
think that would make the code look a lot less confused. Something
like the attached?

FYI I'm happy to test whatever fix you propose for a few days on this
machine. It gets a fair amount of traffic... hopefully enough to
exercise some possible corner cases.

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 3681,3694 **** ReadRecord(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt)
  		 */
  		if (XLOG_BLCKSZ - (RecPtr->xrecoff % XLOG_BLCKSZ) < SizeOfXLogRecord)
  		{
! 			NextLogPage(tmpRecPtr);
  			/* We will account for page header size below */
  		}
! 		if (tmpRecPtr.xrecoff >= XLogFileSize)
  		{
! 			(tmpRecPtr.xlogid)++;
! 			tmpRecPtr.xrecoff = 0;
  		}
  	}
  	else
--- 3681,3694 ----
  		 */
  		if (XLOG_BLCKSZ - (RecPtr->xrecoff % XLOG_BLCKSZ) < SizeOfXLogRecord)
  		{
! 			NextLogPage(RecPtr);
  			/* We will account for page header size below */
  		}
! 		if (RecPtr->xrecoff >= XLogFileSize)
  		{
! 			(RecPtr->xlogid)++;
! 			RecPtr->xrecoff = 0;
  		}
  	}
  	else
***************
*** 3725,3731 **** retry:
  		 * XRecOffIsValid eliminated the zero-page-offset case otherwise. Need
  		 * to skip over the new page's header.
  		 */
! 		tmpRecPtr.xrecoff += pageHeaderSize;
  		targetRecOff = pageHeaderSize;
  	}
  	else if (targetRecOff < pageHeaderSize)
--- 3725,3731 ----
  		 * XRecOffIsValid eliminated the zero-page-offset case otherwise. Need
  		 * to skip over the new page's header.
  		 */
! 		RecPtr->xrecoff += pageHeaderSize;
  		targetRecOff = pageHeaderSize;
  	}
  	else if (targetRecOff < pageHeaderSize)
*** a/src/include/access/xlog_internal.h
--- b/src/include/access/xlog_internal.h
***************
*** 154,166 **** typedef XLogLongPageHeaderData *XLogLongPageHeader;
  /* Align a record pointer to next page */
  #define NextLogPage(recptr) \
  	do { \
! 		if (recptr.xrecoff % XLOG_BLCKSZ != 0) \
! 			recptr.xrecoff += \
! 				(XLOG_BLCKSZ - recptr.xrecoff % XLOG_BLCKSZ); \
! 		if (recptr.xrecoff >= XLogFileSize) \
  		{ \
! 			(recptr.xlogid)++; \
! 			recptr.xrecoff = 0; \
  		} \
  	} while (0)
--- 154,166 ----
  /* Align a record pointer to next page */
  #define NextLogPage(recptr) \
  	do { \
! 		if (recptr->xrecoff % XLOG_BLCKSZ != 0) \
! 			recptr->xrecoff += \
! 				(XLOG_BLCKSZ - recptr->xrecoff % XLOG_BLCKSZ); \
! 		if (recptr->xrecoff >= XLogFileSize) \
  		{ \
! 			(recptr->xlogid)++; \
! 			recptr->xrecoff = 0; \
  		} \
  	} while (0)
Re: [HACKERS] [v9.2] Start new timeline for PITR
On 6/10/11 12:34 PM, Robert Haas wrote:
> On Fri, Jun 10, 2011 at 2:53 PM, Josh Berkus wrote:
>>> Let's imagine we're taking filesystem snapshots each day by
>>> whatever means. We're also archiving xlogs, but only have space for
>>> 48 hours' worth. Now we want to recover to 3 days ago, but there
>>> are no WALs from that time, so we do a crash recovery from the
>>> filesystem snapshot. Doing continuous archiving from this conflicts
>>> with the existing WALs, which we solve by creating a new timeline.
>>
>> How is this different from just changing the recovery_command?
>
> *scratches head*
>
> How is it the same?

Well, presumably I can just change recovery_command to recover from an
empty directory. Then the PITR copy will just come up as soon as it
finishes processing local snapshot WAL, and it'll start its own
timeline. How is this different from what the patch does?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] [v9.2] Start new timeline for PITR
On 10.06.2011 22:34, Robert Haas wrote:
> On Fri, Jun 10, 2011 at 2:53 PM, Josh Berkus wrote:
>>> Let's imagine we're taking filesystem snapshots each day by
>>> whatever means. We're also archiving xlogs, but only have space for
>>> 48 hours' worth. Now we want to recover to 3 days ago, but there
>>> are no WALs from that time, so we do a crash recovery from the
>>> filesystem snapshot. Doing continuous archiving from this conflicts
>>> with the existing WALs, which we solve by creating a new timeline.
>>
>> How is this different from just changing the recovery_command?
>
> *scratches head*
>
> How is it the same?

Creating a dummy recovery.conf with a bogus recovery_command would do
the trick.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] [v9.2] Start new timeline for PITR
On Fri, Jun 10, 2011 at 2:53 PM, Josh Berkus wrote:
>> Let's imagine we're taking filesystem snapshots each day by whatever
>> means. We're also archiving xlogs, but only have space for 48 hours'
>> worth. Now we want to recover to 3 days ago, but there are no WALs
>> from that time, so we do a crash recovery from the filesystem
>> snapshot. Doing continuous archiving from this conflicts with the
>> existing WALs, which we solve by creating a new timeline.
>
> How is this different from just changing the recovery_command?

*scratches head*

How is it the same?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] pg_dump vs malloc
Magnus Hagander writes:
> I came across a situation today with a pretty bad crash of pg_dump,
> due to not checking the return code from malloc(). When looking
> through the code, it seems there are a *lot* of places in pg_dump
> that don't check the malloc return code.

> But we do have a pg_malloc() function in there - but from what I can
> tell it's only used very sparsely?

> Shouldn't we be using that one more or less everywhere

Yup. Have at it.

			regards, tom lane
Re: [HACKERS] [v9.2] Start new timeline for PITR
David,

> Let's imagine we're taking filesystem snapshots each day by whatever
> means. We're also archiving xlogs, but only have space for 48 hours'
> worth. Now we want to recover to 3 days ago, but there are no WALs
> from that time, so we do a crash recovery from the filesystem
> snapshot. Doing continuous archiving from this conflicts with the
> existing WALs, which we solve by creating a new timeline.

How is this different from just changing the recovery_command?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] [BUG] Denormal float values break backup/restore
Marti Raudsepp writes:
> On Fri, Jun 10, 2011 at 17:20, Tom Lane wrote:
>> I put this right about on par with the occasional suggestions that
>> we implement our own filesystem.

> I am worried that legitimate calculations can bring the database into
> a state where a backup succeeds, but is no longer restorable.
> Obviously making reliable backups is the job of a database.

Yes, but my point is that Postgres cannot be held accountable for
every misfeature of the platforms we are using. We are not replacing
libc any more than we are replacing the filesystem, or the TCP stack,
or several other components that people have trouble with from time to
time. We don't have the manpower, nor the expertise, to take
responsibility for all that stuff.

> I see four ways to make this work:
> 1. Clamp denormal numbers to 0.0 when stored into a table.
> 2. Throw an error when denormal numbers are stored into a table.
> 3. Use the FTZ (flush-to-zero) FPU mode so denormal values never
>    appear. I'm not sure whether this is available on all target
>    architectures.
> 4. Change the float4in and float8in functions to accept denormal
>    float literals. This has a nice side-effect of enabling full
>    IEEE 754 floating point range.

Or (5) Lobby your libc providers to make strtod accept denormals
without throwing ERANGE. There is no obvious reason why it shouldn't.
(On at least one of my boxes, it does.)

			regards, tom lane
Re: [HACKERS] Small SSI issues
On 10.06.2011 18:05, Kevin Grittner wrote:
> Heikki Linnakangas wrote:
>> o There is no safeguard against actually wrapping around the SLRU,
>>   just the warning
>
> Any thoughts on what we should do instead? If someone holds open a
> transaction long enough to burn through a billion transaction IDs (or
> possibly less if someone uses a smaller BLCKSZ), should we generate a
> FATAL error?

FATAL is a bit harsh, ERROR seems more appropriate.

>> o I'm suspicious of the OldSerXidPagePrecedesLogically() logic with
>>   32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is larger than
>>   necessary to cover the whole range of 2^32 transactions, so at
>>   high XIDs, say 2^32-1, doesn't it incorrectly think that low XIDs,
>>   say 10, are in the past, not the future?
>
> I will look that over to see; but if this is broken, then one of the
> other SLRU users is probably broken, because I think I stole this
> code pretty directly from another spot.

It was copied from async.c, which doesn't have this problem because
it's not mapping XIDs to the slru. There, the max queue size is
determined by the *_MAX_PAGE, and you point to a particular location
in the SLRU with simply page+offset.

>>> /*
>>>  * Keep a pointer to the currently-running serializable transaction
>>>  * (if any) for quick reference.
>>>  * TODO SSI: Remove volatile qualifier and the then-unnecessary
>>>  * casts?
>>>  */
>>> static volatile SERIALIZABLEXACT *MySerializableXact =
>>> 	InvalidSerializableXact;
>>
>> * So, should we remove it? In most places where contents of
>>   MySerializableXact are modified, we're holding
>>   SerializableXactHashLock in exclusive mode. However, there's two
>>   exceptions:
>>
>>   o GetSafeSnapshot() modifies MySerializableXact->flags without any
>>     lock. It also inspects
>>     MySerializableXact->possibleUnsafeConflicts without a lock. What
>>     if someone sets some other flag in the flags bitmap just when
>>     GetSafeSnapshot() is about to set SXACT_FLAG_DEFERRABLE_WAITING?
>>     One of the flags might be lost :-(.
>>
>>   o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE
>>     without holding a lock. The same danger is here if someone else
>>     tries to set some other flag concurrently.
>>
>>   I think we should simply acquire SerializableXactHashLock in
>>   GetSafeSnapshot(). It shouldn't be so much of a hotspot that it
>>   would make any difference in performance.
>>   CheckForSerializableConflictIn() might be called a lot, however,
>>   so maybe we need to check if the flag is set first, and only
>>   acquire the lock and set it if it's not set already.
>
> OK. Do checks such as that argue for keeping the volatile flag, or do
> you think we can drop it if we make those changes? (That would also
> allow dropping a number of casts which exist just to avoid warnings.)

I believe we can drop it, I'll double-check.

> I'm happy to work on modifications for any of this or to stay out of
> your way if you want to. Should I put together a patch for those
> items where we seem to agree and have a clear way forward?

I'll fix the MySerializableXact locking issue, and come back to the
other issues on Monday. If you have the time and energy to write a
patch by then, feel free, but I can look into them otherwise.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
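The lost-update hazard Heikki describes is the classic unlocked
read-modify-write on a shared bitmap. A tiny self-contained C
demonstration of the pattern (hypothetical names, pthreads rather than
PostgreSQL locks; the loss is timing-dependent, so it only shows up in
a fraction of trials):

    #include <pthread.h>
    #include <stdio.h>

    #define FLAG_DID_WRITE           0x01
    #define FLAG_DEFERRABLE_WAITING  0x02

    static int flags;   /* shared bitmap, deliberately unprotected */

    static void *writer(void *arg)
    {
        (void) arg;
        flags |= FLAG_DID_WRITE;    /* compiles to load, OR, store */
        return NULL;
    }

    static void *waiter(void *arg)
    {
        (void) arg;
        /* set and later clear another flag, as GetSafeSnapshot() does */
        flags |= FLAG_DEFERRABLE_WAITING;
        flags &= ~FLAG_DEFERRABLE_WAITING;
        return NULL;
    }

    int main(void)
    {
        int lost = 0;

        for (int trial = 0; trial < 100000; trial++)
        {
            pthread_t a, b;

            flags = 0;
            pthread_create(&a, NULL, writer, NULL);
            pthread_create(&b, NULL, waiter, NULL);
            pthread_join(a, NULL);
            pthread_join(b, NULL);
            if (!(flags & FLAG_DID_WRITE))
                lost++;   /* writer's bit was erased by waiter's store */
        }
        printf("lost FLAG_DID_WRITE in %d of 100000 trials\n", lost);
        return 0;
    }

(Build with cc -pthread.)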
Re: [HACKERS] gcc 4.6 and hot standby
Mark Kirkwood writes:
> On 09/06/11 06:58, Alex Hunsaker wrote:
>> Yeah :-). However I'll note it looks like it's the default compiler
>> for fedora 15, ubuntu natty and debian sid.

> FWIW Ubuntu natty uses gcc 4.5.2, probably just as well in the light
> of your findings :-)

I've been able to reproduce this on released Fedora 15, and sure
enough it is a compiler bug. The problem comes from these fragments of
ReadRecord():

ReadRecord(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt)
{
	XLogRecPtr	tmpRecPtr = EndRecPtr;

	if (RecPtr == NULL)
		RecPtr = &tmpRecPtr;

	targetRecOff = RecPtr->xrecoff % XLOG_BLCKSZ;

	if (targetRecOff == 0)
		tmpRecPtr.xrecoff += pageHeaderSize;

	record = (XLogRecord *) ((char *) readBuf + RecPtr->xrecoff % XLOG_BLCKSZ);

gcc 4.6.0 is assuming that the value it first fetches for
RecPtr->xrecoff is still usable for computing the "record" pointer,
despite the possible intervening update of tmpRecPtr. That of course
leads to "record" pointing to the wrong place, which results in an
incorrect conclusion that we're looking at an invalid record header,
which leads to killing and restarting the walreceiver. I haven't
traced what happens after that, but apparently in standby mode we'll
come back to ReadRecord with a record pointer that's already advanced
over the page header, else it'd be an infinite loop. Note that this
means that crash recovery, not only standby mode, is broken with this
compiler.

I've filed a bug report for this:
https://bugzilla.redhat.com/show_bug.cgi?id=712480
but I wouldn't care to hold my breath while waiting for a fix to
appear upstream, let alone propagate to all the distros already using
4.6.0. I think we need a workaround.

The obvious question to ask here is why are we updating
"tmpRecPtr.xrecoff" and not "RecPtr->xrecoff"? The latter choice would
be a lot more readable, since the immediately surrounding code doesn't
refer to tmpRecPtr. I think the idea is to guarantee that the caller's
referenced record pointer (if any) isn't modified, but if RecPtr is
not pointing at tmpRecPtr here, we have got big problems anyway. So
I'm tempted to propose this fix:

  	if (targetRecOff == 0)
  	{
  		/*
  		 * Can only get here in the continuing-from-prev-page case, because
  		 * XRecOffIsValid eliminated the zero-page-offset case otherwise. Need
  		 * to skip over the new page's header.
  		 */
- 		tmpRecPtr.xrecoff += pageHeaderSize;
+ 		Assert(RecPtr == &tmpRecPtr);
+ 		RecPtr->xrecoff += pageHeaderSize;
  		targetRecOff = pageHeaderSize;
  	}

Another possibility, which might be less ugly, is to delete the above
code entirely and handle the page-header case in the RecPtr == NULL
branch a few lines above.

Comments?

			regards, tom lane
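For readers following along, the aliasing pattern Tom describes can be
boiled down to a standalone shape like the following. Types and names
here are simplified stand-ins, not PostgreSQL source; whether a given
compiler miscompiles it depends on the compiler version, but a correct
compiler must reload the field after the store:

    #include <stdint.h>
    #include <stdio.h>

    typedef struct { uint32_t xlogid; uint32_t xrecoff; } RecPtrT;

    uint32_t
    read_record(RecPtrT *ptr, uint32_t page_header_size)
    {
        RecPtrT tmp = { 0, 0 };

        if (ptr == NULL)
            ptr = &tmp;               /* ptr may now alias tmp */

        if (ptr->xrecoff % 8192 == 0)
            tmp.xrecoff += page_header_size;  /* updates *ptr when aliased */

        /*
         * A buggy compiler may reuse the xrecoff value it fetched above
         * instead of reloading it here, yielding a stale offset.
         */
        return ptr->xrecoff % 8192;
    }

    int main(void)
    {
        /* correct output is 24; the reported miscompilation yields 0 */
        printf("%u\n", read_record(NULL, 24));
        return 0;
    }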
[HACKERS] pg_dump vs malloc
I came across a situation today with a pretty bad crash of pg_dump,
due to not checking the return code from malloc(). When looking
through the code, it seems there are a *lot* of places in pg_dump that
don't check the malloc return code.

But we do have a pg_malloc() function in there - but from what I can
tell it's only used very sparsely?

Shouldn't we be using that one more or less everywhere, or even
#define it? Or am I missing something in the code here?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
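For context, pg_malloc() is the usual fail-fast wrapper pattern: check
the allocation result once, centrally, and bail out on failure. A
minimal sketch of the idea (illustrative only, not the exact pg_dump
source):

    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Allocate memory or die trying: on out-of-memory, report and exit
     * instead of returning NULL for each caller to (forget to) check.
     */
    void *
    pg_malloc(size_t size)
    {
        void *result = malloc(size);

        if (result == NULL)
        {
            fprintf(stderr, "out of memory\n");
            exit(1);
        }
        return result;
    }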
Re: [HACKERS] TG_DEPTH patch v1
Florian Pflug wrote:
> Here is review of the patch.

Thanks for the review. I think I'd better try to keep the decks clear
for dealing with any SSI issues which may surface during the coming
month, so I'm going to mark this patch "Returned with Feedback" and
look at this for possible resubmission in light of your review in a
later CF. On resubmit I will start with a reference to your review and
proceed with discussion from there.

Thanks,

-Kevin
Re: [HACKERS] WIP: collect frequency statistics for arrays
On Mon, May 23, 2011 at 6:54 PM, Alexander Korotkov wrote:
> Second version of patch attached. Main changes:
> 1) ANY and ALL keywords handling.
> 2) Collecting statistics of array length. It is used in
>    "column <@ const".
> 3) Relatively accurate estimation of "column <@ const" selectivity.
>    This estimation is the hardest part of the patch. I'm worried that
>    the calculations could be too expensive for estimation. Also, the
>    current comments likely don't clarify anything...

Just had a brief look at this ahead of starting review. Initial
comments are that the code is well structured and I doubt there will
be problems at the code level. Looks like a good patch.

At the moment I see no tests. If this code will be exercised by
existing tests then you should put some notes with the patch to
explain that, or at least provide some pointers as to how I might test
this.

Also, I'd like to see some more explanation. Either in comments, or
just as a post to hackers. That saves me time, but we need to be clear
about what this does and does not do, what it might do in the future,
etc. 3+ years from now we need to be able to remember what the code
was supposed to do. You will forget yourself in time, if you write
enough patches. Based on this, I think you'll be writing quite a few
more.

And of course, a few lines for the docs also.

--
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Operator families vs. casts
Alexey,

On Fri, Jun 10, 2011 at 05:46:42PM +0300, Alexey Klyukin wrote:
> Providing my thoughts on the 'mundane' question first.

I was actually referring to this paragraph:

  The standing code handled index/constraint dependencies of changing
  columns by extracting the SQL definition using pg_get_indexdef or
  pg_get_constraintdef, dropping the object, and recreating it afresh.
  To implement this optimization for indexes and index-backed
  constraints, we need to update the index definition without
  perturbing its storage. I found two major ways to do this, and I'm
  unsure which will be preferable, so I'm attaching both as alternate
  implementations of the same outcome. I'd appreciate feedback on
  which is preferable. The first skips the drop and updates
  pg_index.indclass, pg_attribute, and pg_constraint.conexclop. The
  alternate patch retains the drop and create, then resurrects the old
  relfilenode and assigns it to the new object. The second approach is
  significantly simpler and smaller, but it seems less like anything
  else we do. As a further variation on the second approach, I also
  considered drilling holes through the performDeletion() and
  DefineIndex() stacks to avoid the drop-and-later-preserve dynamic,
  but that seemed uglier.

However, while we're on the topic you looked at:

> Here's the relevant part of the original post:
>
>> ATPostAlterTypeCleanup has this comment:
>> 	/*
>> 	 * Re-parse the index and constraint definitions, and attach them to the
>> 	 * appropriate work queue entries. We do this before dropping because in
>> 	 * the case of a FOREIGN KEY constraint, we might not yet have exclusive
>> 	 * lock on the table the constraint is attached to, and we need to get
>> 	 * that before dropping. It's safe because the parser won't actually look
>> 	 * at the catalogs to detect the existing entry.
>> 	 */
>> Is the second sentence true? I don't think so, so I deleted it for
>> now. Here is the sequence of lock requests against the table
>> possessing the FOREIGN KEY constraint when we alter the
>> parent/upstream column:
>>
>> transformAlterTableStmt - ShareRowExclusiveLock
>> ATPostAlterTypeParse - lockmode of original ALTER TABLE
>> RemoveTriggerById() for update trigger - ShareRowExclusiveLock
>> RemoveTriggerById() for insert trigger - ShareRowExclusiveLock
>> RemoveConstraintById() - AccessExclusiveLock
>> CreateTrigger() for insert trigger - ShareRowExclusiveLock
>> CreateTrigger() for update trigger - ShareRowExclusiveLock
>> RI_Initial_Check() - AccessShareLock (3x)
>
> I think the statement in the second sentence of the comment is true.
> ATPostAlterTypeParse is called only from ATPostAlterTypeCleanup. It
> has to grab the lock on the table the constraint is attached to
> before dropping the constraint. What it does is it opens that
> relation with the same lock that is grabbed for AT_AlterColumnType
> subtype, i.e. AccessExclusiveLock. I think there is no preceding
> place in AlterTable chain, that grabs stronger lock on this relation.
> ShareRowExclusiveLock is taken in transformAlterTableStmt (1st
> function in your sequence), but this function is ultimately called
> from ATPostAlterTypeParse, just before the latter grabs the
> AccessExclusiveLock, so ultimately at the point where this comment is
> located no locks are taken for the table with a FOREIGN KEY
> constraint.

The comment is correct that we don't yet have a lock on the remote
table at that point. But why do we need a lock before
RemoveTriggerById() acquires one? True, it is bad form to get a
ShareRowExclusiveLock followed by upgrading to an AccessExclusiveLock,
but the solution there is that RemoveConstraintById() only needs a
ShareRowExclusiveLock.

Granted, in retrospect, I had no business editorializing on this
matter. It's proximate to the patch's changes but unrelated to them.

Thanks,
nm
Re: [HACKERS] [BUG] Denormal float values break backup/restore
On Fri, Jun 10, 2011 at 17:20, Tom Lane wrote:
> I put this right about on par with the occasional suggestions that we
> implement our own filesystem. It's not our core competency and in the
> end there is no value-add. If you need to work with denormals, find a
> platform that supports them better.

Sorry if I wasn't clear on this. I don't care whether PostgreSQL
supports denormal float calculations or not. I know PostgreSQL doesn't
offer IEEE 754 floating point semantics.

I am worried that legitimate calculations can bring the database into
a state where a backup succeeds, but is no longer restorable.
Obviously making reliable backups is the job of a database. Case in
point:

% psql -c 'create table foo as select 0.25e-307::float8/2 as val'
SELECT 1
% pg_dump -t foo > foo.sql
% psql -c 'drop table foo'
% psql < foo.sql
...
ERROR:  "1.24964e-308" is out of range for type double precision
CONTEXT:  COPY foo, line 1, column val: "1.24964e-308"

I see four ways to make this work:
1. Clamp denormal numbers to 0.0 when stored into a table.
2. Throw an error when denormal numbers are stored into a table.
3. Use the FTZ (flush-to-zero) FPU mode so denormal values never
   appear. I'm not sure whether this is available on all target
   architectures.
4. Change the float4in and float8in functions to accept denormal
   float literals. This has a nice side-effect of enabling full
   IEEE 754 floating point range. (A sketch follows this message.)

Regards,
Marti
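As a sketch of what option 4 might look like in plain C (illustrative
names only -- this is not the actual float8in code), the key change is
to stop treating every ERANGE from strtod() as fatal:

    #include <errno.h>
    #include <math.h>
    #include <stdlib.h>

    /*
     * Parse a double, accepting denormal (subnormal) results.  glibc's
     * strtod() sets ERANGE for subnormals even though the value is
     * representable, so treat ERANGE as an error only on true overflow
     * (+/-HUGE_VAL) or complete underflow to zero.
     */
    int
    parse_float8(const char *str, double *result)
    {
        char   *endptr;
        double  val;

        errno = 0;
        val = strtod(str, &endptr);

        if (endptr == str)
            return -1;              /* no number parsed at all */

        if (errno == ERANGE &&
            (val == 0.0 || val == HUGE_VAL || val == -HUGE_VAL))
            return -1;              /* genuine overflow/underflow */

        *result = val;              /* subnormals fall through here */
        return 0;
    }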
Re: [HACKERS] [v9.2] Start new timeline for PITR
On Fri, Jun 10, 2011 at 01:20:25AM -0400, Robert Haas wrote:
> On Thu, Jun 9, 2011 at 8:59 PM, Tom Lane wrote:
>> David Fetter writes:
>>> The nice people at VMware, where I work, have come up with a small
>>> patch to allow PITR to create a new timeline. This is useful in
>>> cases where you're using filesystem snapshots of $PGDATA which may
>>> be old.
>>
>> Huh? We already start a new timeline when doing a
>> non-crash-recovery replay scenario.
>>
>> The code looks pretty confused too, which makes it difficult to
>> reverse-engineer what your point is.
>
> I am guessing that they are taking a filesystem snapshot, and then
> using that to fire up PG. So to PG it looks like a crash recovery,
> but they want a new timeline anyway.

That's pretty much it. More detail:

Let's imagine we're taking filesystem snapshots each day by whatever
means. We're also archiving xlogs, but only have space for 48 hours'
worth. Now we want to recover to 3 days ago, but there are no WALs
from that time, so we do a crash recovery from the filesystem
snapshot. Doing continuous archiving from this conflicts with the
existing WALs, which we solve by creating a new timeline. This also
allows subsequent PITR to other times on the original timeline.

Josh B pointed out that since setting this option to true conflicts
with another option, having both should prevent recovery from even
starting, and I'll work up a patch for this tonight or at latest
tomorrow.

Cheers,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] SSI work for 9.1
"Kevin Grittner" wrote: > I am in full agreement with this patch. I found that pgindent would like to tweak whitespace in three places in that patch, and I found an unnecessary include that I would like to remove. Normally, I would post a new version of the patch with those adjustments, but there's been a disquieting tendency for people to not look at what's actually happening with revisions of this patch and act like the sky is falling with each refinement. Let me be clear: each posted version of this patch has been safe and has been an improvement on what came before. Dan and I didn't disagree about what to do at any point; Dan figured out what to do with two calls I left alone because I faded before I could work out how to deal with them. Essentially we collaborated on-list rather than discussing things off-list and posting the end result. Perhaps that was a bad choice, but time was short and I had hopes that a change people had requested could be included in beta2. Here are the tweaks which should be applied after Dan's last version of the patch. If people prefer, I'll post a roll-up including them. http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=0258af4168a130af0d1e52294b248d54715b5f72 http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=bb951bacd9700cdbddb0beba338a63bd301b200d -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Creating new remote branch in git?
On 06/10/2011 11:26 AM, Alex Hunsaker wrote:
> On Fri, Jun 10, 2011 at 00:53, Greg Smith wrote:
>> 4) Use a system with git>=1.7.0, which adds:
>>
>> git branch --set-upstream REL9_1_STABLE origin/REL9_1_STABLE
>
> But wait! There's more!
>
> 5) Delete your local branch and recreate it after you push the branch
> out

That's what I've done in the past, and it works, but I suspect #4 is
the best answer.

cheers

andrew
Re: [HACKERS] Creating new remote branch in git?
On Fri, Jun 10, 2011 at 00:53, Greg Smith wrote:
> On 06/10/2011 12:19 AM, Alex Hunsaker wrote:
>> It looks like if you push the remote branch first everything should
>> work nicely:
>> git checkout master
>> git push origin origin:refs/heads/REL9_1_STABLE
>> git fetch # fetch the new branch
>> git checkout REL9_1_STABLE
>
> This is basically the state of the art right now for the most
> frequently deployed versions of git. I don't think checking out
> master first is necessary though.

I assume it will use the current HEAD as the branch point, which is
why I checked out master :)

> Potentially useful automation/trivia for alternate approaches
> includes:
>
> 1) Write a little script to do this messy chore, so you don't have to
> remember this weird "create a new branch using a full refspec"
> syntax. There is an example named git-create-branch along with a
> short tutorial on this subject at
> http://www.zorched.net/2008/04/14/start-a-new-branch-on-your-remote-git-repository/
>
> 2) Use git_remote_branch https://github.com/webmat/git_remote_branch
> which is the swiss army knife of remote branch hackery automation.
>
> 3) Rather than manually hack the config files, use "git config" to do
> it. Not sure if this is completely workable, but something like this
> might connect the newly created branch to your local one after
> pushing it out, without actually opening the config with an editor:
>
> git config branch.REL9_1_STABLE.remote origin
> git config branch.REL9_1_STABLE.merge refs/heads/REL9_1_STABLE
>
> 4) Use a system with git>=1.7.0, which adds:
>
> git branch --set-upstream REL9_1_STABLE origin/REL9_1_STABLE

But wait! There's more!

5) Delete your local branch and recreate it after you push the branch
out:

git branch REL9_1_STABLE
git push origin REL9_1_STABLE
# -f is shorthand; you could git branch -d REL9_1_STABLE and re-make it
git branch -f REL9_1_STABLE origin/REL9_1_STABLE

6) Use push -u.

It's git, so there are probably another half dozen ways to do this...
What I'm curious about is what is the 'proper' way? Or is that a
nonsensical question when talking about git :-P
Re: [HACKERS] Small SSI issues
Heikki Linnakangas wrote:
> Here's a bunch of small issues that I spotted:

Thanks for going over it again. It is encouraging that you didn't
spot any *big* issues.

> * The oldserxid code is broken for non-default BLCKSZ.
>   o The warning will come either too early or too late

Good point. That part is easily fixed -- if we want to keep the
warning, in light of the next few points.

>   o There is no safeguard against actually wrapping around the SLRU,
>     just the warning

Any thoughts on what we should do instead? If someone holds open a
transaction long enough to burn through a billion transaction IDs (or
possibly less if someone uses a smaller BLCKSZ), should we generate a
FATAL error?

Of course, one other option would be to allow more SLRU segment files,
as you raised on another thread. Then this issue goes away because
they would hit other, existing, protections against transaction
wraparound first.

>   o I'm suspicious of the OldSerXidPagePrecedesLogically() logic with
>     32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is larger than
>     necessary to cover the whole range of 2^32 transactions, so at
>     high XIDs, say 2^32-1, doesn't it incorrectly think that low
>     XIDs, say 10, are in the past, not the future?

I will look that over to see; but if this is broken, then one of the
other SLRU users is probably broken, because I think I stole this code
pretty directly from another spot.

>> /*
>>  * Keep a pointer to the currently-running serializable
>>  * transaction (if any) for quick reference.
>>  * TODO SSI: Remove volatile qualifier and the then-unnecessary
>>  * casts?
>>  */
>> static volatile SERIALIZABLEXACT *MySerializableXact =
>> 	InvalidSerializableXact;
>
> * So, should we remove it? In most places where contents of
>   MySerializableXact are modified, we're holding
>   SerializableXactHashLock in exclusive mode. However, there's two
>   exceptions:
>
>   o GetSafeSnapshot() modifies MySerializableXact->flags without any
>     lock. It also inspects
>     MySerializableXact->possibleUnsafeConflicts without a lock. What
>     if someone sets some other flag in the flags bitmap just when
>     GetSafeSnapshot() is about to set SXACT_FLAG_DEFERRABLE_WAITING?
>     One of the flags might be lost :-(.
>
>   o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE
>     without holding a lock. The same danger is here if someone else
>     tries to set some other flag concurrently.
>
>   I think we should simply acquire SerializableXactHashLock in
>   GetSafeSnapshot(). It shouldn't be so much of a hotspot that it
>   would make any difference in performance.
>   CheckForSerializableConflictIn() might be called a lot, however,
>   so maybe we need to check if the flag is set first, and only
>   acquire the lock and set it if it's not set already.

OK. Do checks such as that argue for keeping the volatile flag, or do
you think we can drop it if we make those changes? (That would also
allow dropping a number of casts which exist just to avoid warnings.)

> * Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in
>   ReleasePredicateLocks() for a fleeting moment while the function
>   releases all conflicts and locks held by the transaction, and
>   finally the sxact struct itself containing the flag.

I think that one can go away. It had more of a point many months ago
before we properly sorted out what belongs in
PreCommit_CheckForSerializationFailure() and what belongs in
ReleasePredicateLocks(). The point at which we reached clarity on that
and moved things around, this flag probably became obsolete.

>   Also, isn't a transaction that's already been marked for death the
>   same as one that has already rolled back, for the purposes of
>   detecting conflicts?

Yes. We should probably ignore any marked-for-death transaction during
conflict detection and serialization failure detection. As a start,
anywhere there is now a check for rollback and not for this, we should
change it to this. There may be some places this can be checked which
haven't yet been identified and touched.

> * The HavePartialClearThrough/CanPartialClearThrough mechanism needs
>   a better explanation. The only explanation currently is this:
>
>> if (--(PredXact->WritableSxactCount) == 0)
>> {
>> 	/*
>> 	 * Release predicate locks and rw-conflicts in for all committed
>> 	 * transactions. There are no longer any transactions which might
>> 	 * conflict with the locks and no chance for new transactions to
>> 	 * overlap. Similarly, existing conflicts in can't cause pivots,
>> 	 * and any conflicts in which could have completed a dangerous
>> 	 * structure would already have caused a rollback, so any
>> 	 * remaining ones must be benign.
>> 	 */
>> 	PredXact->CanPartialClearThrough =
>> 		PredXact->LastSxactCommitSeqNo;
>> }
>
> If I understand that correctly, any predicate lock belonging to any
> already committed transaction can be safely zapped away at that
> instant.
Re: [HACKERS] Operator families vs. casts
Noah,

Providing my thoughts on the 'mundane' question first.

On Tue, May 24, 2011 at 1:40 PM, Noah Misch wrote:
> I also had a more mundane design question in the second paragraph of
> [2]. It can probably wait for the review of the next version of the
> patch. However, given that it affects a large percentage of the
> patch, I'd appreciate any early feedback on it.

Here's the relevant part of the original post:

> ATPostAlterTypeCleanup has this comment:
> 	/*
> 	 * Re-parse the index and constraint definitions, and attach them to the
> 	 * appropriate work queue entries. We do this before dropping because in
> 	 * the case of a FOREIGN KEY constraint, we might not yet have exclusive
> 	 * lock on the table the constraint is attached to, and we need to get
> 	 * that before dropping. It's safe because the parser won't actually look
> 	 * at the catalogs to detect the existing entry.
> 	 */
> Is the second sentence true? I don't think so, so I deleted it for
> now. Here is the sequence of lock requests against the table
> possessing the FOREIGN KEY constraint when we alter the
> parent/upstream column:
>
> transformAlterTableStmt - ShareRowExclusiveLock
> ATPostAlterTypeParse - lockmode of original ALTER TABLE
> RemoveTriggerById() for update trigger - ShareRowExclusiveLock
> RemoveTriggerById() for insert trigger - ShareRowExclusiveLock
> RemoveConstraintById() - AccessExclusiveLock
> CreateTrigger() for insert trigger - ShareRowExclusiveLock
> CreateTrigger() for update trigger - ShareRowExclusiveLock
> RI_Initial_Check() - AccessShareLock (3x)

I think the statement in the second sentence of the comment is true.
ATPostAlterTypeParse is called only from ATPostAlterTypeCleanup. It
has to grab the lock on the table the constraint is attached to before
dropping the constraint. What it does is it opens that relation with
the same lock that is grabbed for the AT_AlterColumnType subtype, i.e.
AccessExclusiveLock. I think there is no preceding place in the
AlterTable chain that grabs a stronger lock on this relation.
ShareRowExclusiveLock is taken in transformAlterTableStmt (1st
function in your sequence), but this function is ultimately called
from ATPostAlterTypeParse, just before the latter grabs the
AccessExclusiveLock, so ultimately at the point where this comment is
located no locks are taken for the table with a FOREIGN KEY
constraint.

Alexey.
Re: [HACKERS] [BUG] Denormal float values break backup/restore
Marti Raudsepp writes:
> Looking at utils/adt/float.c, seems that some platforms also have
> other problems with the strtod() function. Maybe it's time to
> implement our own, without bugs and with proper handling for denormal
> float values?

I put this right about on par with the occasional suggestions that we
implement our own filesystem. It's not our core competency and in the
end there is no value-add. If you need to work with denormals, find a
platform that supports them better.

			regards, tom lane
Re: [HACKERS] Feature idea: standard_quoting_identifiers property
On 10.06.2011 13:06, Grzegorz Szpetkowski wrote:
> What do you think about adding new postgresql.conf property, let's
> briefly say standard_quoting_identifiers with default value off to
> maintain backward compatibility, which allows to use standard
> upper-case equivalents (so Foo and "FOO" will be the same, but Foo
> and "foo" not)?

This has been suggested and rejected, or at least left unresolved
because no-one has been able to figure out all the details, many times
before. You should read the past discussions on this. For starters,
see the list of threads listed on the TODO item on this:
http://wiki.postgresql.org/wiki/Todo
"Consider allowing control of upper/lower case folding of unquoted
identifiers"

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Creating new remote branch in git?
On Fri, Jun 10, 2011 at 12:40 AM, Tom Lane wrote:
> Joe Abbate writes:
>> No, it doesn't trash anything. The branch is just an additional
>> "pointer" to 'master' (at that point in time). I recommend taking a
>> look at this:
>> http://progit.org/book/ch3-5.html
>
> Yes, I was reading exactly that before posting. It talks about
> pushing a branch you've created locally, and it talks about what
> happens when others pull that down, and it's about as clear as mud
> w/r/t how the original pusher sees the remote branch. What I want is
> to end up with my local branch tracking the remote branch in the same
> way as if I'd not been the branch creator. Preferably without having
> to do anything as ugly as delete the branch, or re-clone, or manually
> hack config files. This has got to be a use case that the git authors
> have heard of before...

I have done this quite a few times on GitHub and it has never barfed
on me in any surprising way:

# make sure local master is up-to-date with origin/master, and then do
git checkout master
git checkout -b new_branch
git push origin new_branch

From here on I work as if that new_branch was handed to me from the
origin. I believe this also takes care of setting up the .git/config
file properly.

Just in case it is needed: to delete a branch on remote, just do

git push origin :new_branch

It will keep your local branch (if you have it), but will nuke the
remote branch.

Regards,

PS: Play a bit on GitHub

--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Re: [HACKERS] Core Extensions relocation
Hi,

Greg Smith writes:
> Following up on the idea we've been exploring for making some
> extensions more prominent, attached is the first rev that I think may
> be worth considering seriously. Main improvement from the last is
> that I reorganized the docs to break out what I decided to
> tentatively name "Core Extensions" into their own chapter. No longer
> mixed in with the rest of the contrib modules, and I introduce them a
> bit differently. If you want to take a quick look at the new page, I
> copied it to http://www.2ndquadrant.us/docs/html/extensions.html

Thanks a lot for working on this! I have two remarks here.

First, I think that the “core extensions” (+1 for this naming) should
not be found in a documentation appendix, but in the main
documentation, as a new chapter in Part II where we list data types
and operators and system functions. Between current chapters 9 and 10
would be my vote.

Then, I think the angle to use to present “core extensions” would be
that those are things maintained like the core server, but that you
might not need at all. There's no SQL-level feature that relies on
them, it's all “extra”, but it's there nonetheless, because you might
need it.

Other than that, +1 for your patch. I'd stress that I support your
idea to split the extensions at the source level; I think that's
really helping to get the message out: that this is trustworthy and
maintained code.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr   PostgreSQL : Expertise, Formation et Support
[HACKERS] Small SSI issues
It makes wonders to take a couple of months break from looking at a piece of code, and then review it in detail again. It's like a whole new pair of eyes :-). Here's a bunch of small issues that I spotted: * The oldserxid code is broken for non-default BLCKSZ. o The warning will come either too early or too late o There is no safeguard against actually wrapping around the SLRU, just the warning o I'm suspicious of the OldSerXidPagePrecedesLogically() logic with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger than necessary to cover the whole range of 2^32 transactions, so at high XIDs, say 2^32-1, doesn't it incorrectly think that low XIDs, say 10, are in the past, not the future? We've discussed these SLRU issues already, but still.. /* * Keep a pointer to the currently-running serializable transaction (if any) * for quick reference. * TODO SSI: Remove volatile qualifier and the then-unnecessary casts? */ static volatile SERIALIZABLEXACT *MySerializableXact = InvalidSerializableXact; * So, should we remove it? In most places where contents of MySerializableXact are modified, we're holding SerializableXactHashLock in exclusive mode. However, there's two exceptions: o GetSafeSnapshot() modifies MySerializableXact->flags without any lock. It also inspects MySerializableXact->possibleUnsafeConflicts without a lock. What if somone sets some other flag in the flags bitmap just when GetSafeSnapshot() is about to set SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost :-(. o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE without holding a lock. The same danger is here if someone else tries to set some other flag concurrently. I think we should simply acquire SerializableXactHashLock in GetSafeSnapshot(). It shouldn't be so much of a hotspot that it would make any difference in performance. CheckForSerializableConflictIn() might be called a lot, however, so maybe we need to check if the flag is set first, and only acquire the lock and set it if it's not set already. * Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in ReleasePredicateLocks() for a fleeting moment while the function releases all conflicts and locks held by the transaction, and finally the sxact struct itself containing the flag. Also, isn't a transaction that's already been marked for death the same as one that has already rolled back, for the purposes of detecting conflicts? * The HavePartialClearThrough/CanPartialClearThrough mechanism needs a better explanation. The only explanation currently is this: if (--(PredXact->WritableSxactCount) == 0) { /* * Release predicate locks and rw-conflicts in for all committed * transactions. There are no longer any transactions which might * conflict with the locks and no chance for new transactions to * overlap. Similarly, existing conflicts in can't cause pivots, * and any conflicts in which could have completed a dangerous * structure would already have caused a rollback, so any * remaining ones must be benign. */ PredXact->CanPartialClearThrough = PredXact->LastSxactCommitSeqNo; } If I understand that correctly, any predicate lock belonging to any already committed transaction can be safely zapped away at that instant. We don't do it immediately, because it might be expensive. Instead, we set CanPartialClearThrough, and do it lazily in ClearOldPredicateLocks(). But what is the purpose of HavePartialClearThrough? 
--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Patch: add GiST support for BOX @> POINT queries
2011/2/24 Andrew Tipton :
> While playing around with the BOX and POINT datatypes, I was surprised to
> note that BOX @> POINT (and likewise POINT <@ BOX) queries were not using
> the GiST index I had created on the BOX column. The attached patch adds a
> new strategy @>(BOX,POINT) to the box_ops opclass. Internally,
> gist_box_consistent simply transforms the POINT into its corresponding BOX.
> This is my first Postgres patch, and I wasn't able to figure out how to go
> about creating a regression test for this change. (All existing tests do
> pass, but none of them seem to specifically test index behaviour.)

I reviewed the patch and was worried about the hard-wired magic number used as a StrategyNumber. At the least, you should use a #define to indicate the number's meaning.

In addition, the modified gist_box_consistent() is too dangerous: q_box is declared locally inside the if block, but a pointer to it is passed outside the block. AFAIK, whether a block's local storage remains usable outside that block is platform-dependent (strictly speaking, using it is undefined behavior in C).

Isn't it worth adding a new consistent function for this purpose? The approach in the patch as it stands looks like a kludge to me.

Regards,
--
Hitoshi Harada
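For readers less familiar with C scoping, here is a self-contained sketch of the bug pattern being described; the types and names are simplified stand-ins, not the patch's actual code:

    #include <stdio.h>

    typedef struct { double x, y; } Point;
    typedef struct { Point low, high; } Box;

    /* Returns a pointer into storage whose lifetime has already ended. */
    static Box *box_from_point(const Point *p, int transform)
    {
        Box *result = NULL;

        if (transform)
        {
            Box q_box;          /* automatic storage: lifetime ends at the brace */
            q_box.low = *p;
            q_box.high = *p;    /* degenerate box covering just the point */
            result = &q_box;    /* the pointer escapes the block... */
        }
        return result;          /* ...so dereferencing it later is undefined */
    }

On many platforms the stack slot happens to survive long enough that this appears to work, which is exactly what makes it dangerous; allocating the box in a caller-provided buffer or with palloc() avoids the problem.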
[HACKERS] Feature idea: standard_quoting_identifiers property
Forgive me if there was already some discussion about this (I couldn't find any). As the documentation describes (http://www.postgresql.org/docs/9.1/static/sql-syntax-lexical.html):

"Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case. For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but "Foo" and "FOO" are different from these three and each other. (The folding of unquoted names to lower case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be folded to upper case. Thus, foo should be equivalent to "FOO" not "foo" according to the standard. If you want to write portable applications you are advised to always quote a particular name or never quote it.)"

What do you think about adding a new postgresql.conf property, say standard_quoting_identifiers, with default value off to maintain backward compatibility, which would fold unquoted names to upper case as the standard requires (so Foo and "FOO" would be the same, but Foo and "foo" would not)?

Pros (that I see):
-- SQL standard conformance
-- easier migration from/to other RDBMSs like Oracle DB

Regards,
Grzegorz Szpetkowski
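A minimal sketch of what the proposed switch might look like; the GUC name comes from the proposal, but the function and its placement are hypothetical illustrations, not PostgreSQL's actual scanner code:

    #include <ctype.h>

    static int standard_quoting_identifiers = 0;   /* proposed GUC, default off */

    /* Fold an unquoted identifier in place, per the active convention. */
    static void fold_unquoted_identifier(char *ident)
    {
        for (; *ident; ident++)
        {
            if (standard_quoting_identifiers)
                *ident = (char) toupper((unsigned char) *ident);  /* SQL standard */
            else
                *ident = (char) tolower((unsigned char) *ident);  /* historical PG */
        }
    }

The folding itself is trivial; the hard part is that every unquoted name already stored in the catalogs under the lower-case convention would compare differently once the switch is flipped.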
[HACKERS] [BUG] Denormal float values break backup/restore
Hi list,

I was playing around with denormal float values in response to the Itanium thread and noticed that Postgres' float conversion functions behave inconsistently, at least on Linux with glibc 2.14. It can successfully convert denormal float values to strings:

marti=# select '0.25e-307'::float8/2 as val;
    val
-----------
 1.25e-308

But trying to convert back to float fails:

marti=# select ('0.25e-307'::float8/2)::text::float8 as val;
ERROR: "1.25e-308" is out of range for type double precision

The most significant impact of this is that anyone who has these values in their tables can't restore them from backup. I'm surprised nobody has reported this yet, but it seems worthy of fixing in 9.2 at least.

Looking at utils/adt/float.c, it seems that some platforms also have other problems with the strtod() function. Maybe it's time to implement our own, without bugs and with proper handling for denormal float values?

This also applies to float4:

marti=# select ('1.40129846432481707e-45'::float4/4)::text::float4;
ERROR: value out of range: underflow

Another erratic behavior of float4in:

marti=# select ('1.40129846432481707e-45'::float4/2)::text;
         text
----------------------
 7.00649232162409e-46

marti=# select ('1.40129846432481707e-45'::float4/2)::text::float4;
   float4
------------
 1.4013e-45

Regards,
Marti
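The underlying portability trap is that strtod() is allowed to set errno to ERANGE on gradual underflow even though it returns a perfectly usable denormal value. A minimal self-contained sketch of a more tolerant caller (just the idea, not the actual float8in code):

    #include <errno.h>
    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *s = "1.25e-308";    /* subnormal for an IEEE 754 double */
        char *end;
        double val;

        errno = 0;
        val = strtod(s, &end);

        /* Reject only genuine overflow or underflow-to-zero; a nonzero
         * finite result with ERANGE is a denormal we can keep. */
        if (errno == ERANGE && (val == 0.0 || isinf(val)))
            printf("out of range: \"%s\"\n", s);
        else
            printf("accepted: %.17g\n", val);
        return 0;
    }

glibc does set ERANGE for subnormal results, which is presumably why the text-to-float direction fails there while float-to-text works fine.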
Re: [HACKERS] TG_DEPTH patch v1
On Jan29, 2011, at 00:15 , Kevin Grittner wrote:
> The people who write my paychecks have insisted on me chunking out
> some items which are part of our long-term plan besides the one I've
> been focusing on lately. Most of it isn't of interest to anyone
> outside Wisconsin Courts, but this piece might be; so I'm posting it
> and putting onto the first 9.2 CF. We'll be using it for
> development starting Monday and in production in two or three
> months, so it should be pretty well tested by July. :-)

Here is a review of the patch.

The patch didn't apply cleanly anymore due to changes to the plpgsql regression test. Merging the changes was trivial, though. Updated patch attached.

* Regression Tests

Since I had to merge them anyway, I started with the regression tests. If I'm reading them correctly, the second 'raise' in tg_depth_a_tf() is never executed. tg_depth_b_tf() ends with an insert into tg_depth_c, which unconditionally throws U. Thus, tg_depth_a_tf() never gets further than the insert into tg_depth_b. This effectively means that the test fails to verify that TG_DEPTH is correctly reset after a non-exceptional return from a nested trigger invocation.

* Implementation

Now to the implementation. The trigger depth is incremented before calling the trigger function in ExecCallTriggerFunc() and decremented right afterwards, which seems fine - apart from the fact that the decrement is skipped in case of an error. The patch handles that by saving the value of tg_depth in PushTransaction() and restoring it in {Commit,Abort}SubTransaction(). While I can't come up with a failure case for that approach, it seems rather fragile - who's to say that nested trigger invocations correspond that tightly to sub-transactions? At the very least this needs a comment explaining why this is safe, but I believe the same effect could be achieved more cleanly by a TRY/CATCH block in ExecCallTriggerFunc() (a sketch follows below).

* Other in-core PLs

As it stands, the patch only exports tg_depth to plpgsql functions, not to functions written in one of the other in-core PLs. It'd be good to change that, I believe - otherwise the other PLs become second-class citizens in the long run.

best regards,
Florian Pflug

Attachment: tg_depth.v2.patch
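A hedged sketch of the TRY/CATCH alternative suggested above; this is a hypothetical wrapper, not the actual ExecCallTriggerFunc() code, and tg_depth stands in for whatever counter the patch maintains:

    static int tg_depth = 0;    /* depth counter, incremented per trigger call */

    static Datum
    call_trigger_guarded(FunctionCallInfo fcinfo)
    {
        Datum result;

        tg_depth++;
        PG_TRY();
        {
            result = FunctionCallInvoke(fcinfo);
        }
        PG_CATCH();
        {
            tg_depth--;         /* undo the increment on the error path... */
            PG_RE_THROW();
        }
        PG_END_TRY();
        tg_depth--;             /* ...and on normal return */

        return result;
    }

With this shape the counter is balanced on both the normal and the error path, so the save/restore dance in PushTransaction() and {Commit,Abort}SubTransaction() would no longer be needed.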