Re: [HACKERS] Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)
On Tue, Jul 17, 2012 at 06:56:50PM -0400, Tom Lane wrote: > Robert Haas writes: > > On Mon, Jul 16, 2012 at 3:18 PM, Tom Lane wrote: > >> BTW, while we are on the subject: hasn't this split completely > >> broken the statistics about backend-initiated writes? > > > Yes, it seems to have done just that. > > This implies that nobody has done pull-the-plug testing on either > HEAD or 9.2 since the checkpointer split went in (2011-11-01), > because even a modicum of such testing would surely have shown that > we're failing to fsync a significant fraction of our write traffic. > > Furthermore, I would say that any performance testing done since > then, if it wasn't looking at purely read-only scenarios, isn't > worth the electrons it's written on. In particular, any performance > gain that anybody might have attributed to the checkpointer splitup > is very probably hogwash. > > This is not giving me a warm feeling about our testing practices. Is there any part of this that the buildfarm, or some other automation framework, might be able to handle? Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] FW: Patch for option in pg_resetxlog for restore from WAL files
I have uploaded the patch for new option in pg_resetxlog at below location: https://commitfest.postgresql.org/action/patch_view?id=897 This completes the implementation of Option-2 discussed in below mail. Now I will work on Option-1 (1. To compute the value of max LSN in data pages based on user input whether he wants it for an individual file, a particular directory or whole database.) > From: Amit kapila > Sent: Wednesday, July 18, 2012 7:17 PM > Patch implementing the design in below mail chain is attached with this mail. >From: Amit Kapila [mailto:amit.kap...@huawei.com] >Sent: Thursday, July 05, 2012 10:21 AM >>From: Robert Haas [mailto:robertmh...@gmail.com] >>Sent: Friday, June 22, 2012 8:59 PM >>On Fri, Jun 22, 2012 at 5:25 AM, Amit Kapila wrote: >> Based on the discussion and suggestions in this mail chain, following features can be implemented: >> >> 1. To compute the value of max LSN in data pages based on user input whether he wants it for an individual file, >> a particular directory or whole database. >> >> 2a. To search the available WAL files for the latest checkpoint record and prints the value. >> 2b. To search the available WAL files for the latest checkpoint record and recreates a pg_control file pointing at that checkpoint. >> >> I have kept both options to address different kind of corruption scenarios. > I think I can see all of those things being potentially useful. There > are a couple of pending patches that will revise the WAL format > slightly; not sure how much those are likely to interfere with any > development you might do on (2) in the meantime. With Regards, Amit Kapila.
Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)
On Fri, Jul 20, 2012 at 03:39:33PM -0400, Robert Haas wrote: > On Fri, Jul 20, 2012 at 1:52 PM, Tom Lane wrote: > > And with that too. The STRICT option is a fairly obvious security > > hazard, but who's to say there are not others? I think it'd be easier > > to make a case for forbidding a non-superuser from altering *any* > > property of a C function. I'd rather start from the point of allowing > > only what is clearly safe than disallowing only what is clearly unsafe. > > That seems like a fairly drastic overreaction. Are you going to ban > renaming it or changing the owner, which are in completely different > code paths? Yuck. Even if you only ban it for the main ALTER > FUNCTION code path, it seems pretty draconian, because it looks to me > like nearly everything else that's there is perfectly safe. I mean, > assuming the guy who wrote the C code didn't do anything completely > insane or malicious, setting GUCs or whatever should be perfectly OK. > Honestly, if you want to change something in the code, I'm not too > convinced that there's anything better than what Noah proposed > originally. How about a compromise of blocking GUC and STRICT changes while allowing everything else? We add PGC_USERSET GUCs in most releases. As long as non-superuser owners of trusted-language functions can change attached GUC settings, review for each new GUC really ought to consider that scenario. That will be easy to forget. I'm already wary about allowing changes to GUCs like sql_inheritance and search_path. By contrast, the list of ALTER FUNCTION alterations has grown slowly; the last addition before PostgreSQL 9.2 came in PostgreSQL 8.3. Anyone implementing a new alteration will be modifying AlterFunction() and have ample opportunity to notice from surrounding code the need to identify a suitable permissions check. Also, like you say, the other existing alterations are clearly safe. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] b-tree index search algorithms
Am 20.07.12 01:40, schrieb Tom Lane: RelationGetDescr(rel)->attrs[n]->attbyval Thanks! Now does 'Relation' refer to the whole table or only the columns that are supposed to be scanned? So will RelationGetDescr(rel)->attrs[0] give me the description of the first column relevant to the current b-tree scan or simply to the first column in the table? Naively I tried RelationGetDescr(rel)->attrs[scankey->sk_attno] but it results in a segmentation fault. A first version of my algorithm is running now (very simple test case) but I had to implement a workaround for the following behavior: The (binary) search is supposed to always find the first matching offset on a page, but I do not understand how this is guaranteed, since the semantics of a binary search do not guarantee this. The type being searched where it throws me off specifically is a 'chunk_id', page contents are: 1: 12000 2: 12001 3: 12002 4: 12003 5: 12004 6: 12004 7: 12005 8: 12005 9: 12006 10: 12006 11: 16393 12: 16393 13: 16394 14: 16394 15: 16395 16: 16395 Binary search finds 11 in 4 steps, interpolation search finds 12 in 3 steps. Now if there was an additional value like "17: 16396", binary search should also find 12 first, right? Regards, Samuel Vogel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bgwriter, regression tests, and default shared_buffers settings
On 07/19/2012 10:32 AM, Andrew Dunstan wrote: On 07/19/2012 10:12 AM, Tom Lane wrote: Robert Haas writes: On Wed, Jul 18, 2012 at 5:30 PM, Andrew Dunstan wrote: Or we could provide an initdb flag which would set an upper bound on shared_buffers, and have make check (at least) use it. How about a flag that sets the exact value for shared_buffers, rather than a maximum? I think a lot of users would like initdb --shared-buffers=8GB or whatever. That would be significantly harder to deploy in the buildfarm context. We don't know that all the animals are capable of coping with 16MB (or whatever target we settle on for make check) today. Yeah - unless we allow some fallback things could get ugly. I do like the idea of allowing a settable ceiling on shared_buffers instead of having it completely hardcoded as now. Here's a draft patch. cheers andrew diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml index 08ee37e..69cf625 100644 --- a/doc/src/sgml/ref/initdb.sgml +++ b/doc/src/sgml/ref/initdb.sgml @@ -220,6 +220,17 @@ PostgreSQL documentation + --max-shared-buffers=memory + + +Specify the maximum amount of memory to try for setting shared_buffers. +The default is 128Mb. It can be specified in Gigabytes (e.g. 8Gb), +Megabytes (e.g. 32Mb) or blocks (with no suffix). + + + + + -N --nosync diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 4292231..44243b9 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -120,6 +120,7 @@ static bool noclean = false; static bool do_sync = true; static bool show_setting = false; static char *xlog_dir = ""; +static int max_shared_buffers = 16384; /* internal vars */ @@ -227,6 +228,7 @@ static bool check_locale_name(int category, const char *locale, static bool check_locale_encoding(const char *locale, int encoding); static void setlocales(void); static void usage(const char *progname); +static void set_max_shared_buffers(const char * arg); #ifdef WIN32 static int CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo); @@ -1071,7 +1073,7 @@ test_config_settings(void) static const int trial_conns[] = { 100, 50, 40, 30, 20, 10 }; - static const int trial_bufs[] = { + static const int preset_trial_bufs[] = { 16384, 8192, 4096, 3584, 3072, 2560, 2048, 1536, 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 50 @@ -1079,14 +1081,40 @@ test_config_settings(void) char cmd[MAXPGPATH]; const int connslen = sizeof(trial_conns) / sizeof(int); - const int bufslen = sizeof(trial_bufs) / sizeof(int); + const int preset_bufslen = sizeof(preset_trial_bufs) / sizeof(int); + int *trial_bufs; int i, + max_bufs = preset_trial_bufs[0], + n_extra = 1, + bufslen = 1, status, test_conns, test_buffs, ok_buffers = 0; + while (max_bufs * 2 < max_shared_buffers) + { + n_extra++; + max_bufs *= 2; + } + + trial_bufs = pg_malloc((preset_bufslen + n_extra) * sizeof(int)); + + trial_bufs[0] = max_shared_buffers; + + while (max_bufs > preset_trial_bufs[0]) + { + trial_bufs[bufslen++] = max_bufs; + max_bufs = max_bufs / 2; + } + for (i= 0; i < preset_bufslen; i++) + { + if (preset_trial_bufs[i] >= max_shared_buffers) + continue; + trial_bufs[bufslen++] = preset_trial_bufs[i]; + } + printf(_("selecting default max_connections ... ")); fflush(stdout); @@ -1122,7 +1150,8 @@ test_config_settings(void) for (i = 0; i < bufslen; i++) { /* Use same amount of memory, independent of BLCKSZ */ - test_buffs = (trial_bufs[i] * 8192) / BLCKSZ; + /* Avoid overflow by doing the operations in the right order */ + test_buffs = BLCKSZ <= 8192 ? trial_bufs[i] * (8192 / BLCKSZ) : trial_bufs[i] / (BLCKSZ / 8192); if (test_buffs <= ok_buffers) { test_buffs = ok_buffers; @@ -1143,7 +1172,9 @@ test_config_settings(void) } n_buffers = test_buffs; - if ((n_buffers * (BLCKSZ / 1024)) % 1024 == 0) + if ((n_buffers * (BLCKSZ / 1024)) % (1024 * 1024) == 0) + printf("%dGB\n", (n_buffers * (BLCKSZ / 1024)) / (1024 * 1024)); + else if ((n_buffers * (BLCKSZ / 1024)) % 1024 == 0) printf("%dMB\n", (n_buffers * (BLCKSZ / 1024)) / 1024); else printf("%dkB\n", n_buffers * (BLCKSZ / 1024)); @@ -2740,9 +2771,11 @@ usage(const char *progname) "set default locale in the respective category for\n" "new databases (default taken from environment)\n")); printf(_(" --no-locale equivalent to --locale=C\n")); + printf(_(" --max-shared-buffers=MEMORY\n" + "maximum shared buffers setting to try, default 128Mb\n")); printf(_(" --pwfile=FILE read password for the new superuser from file\n")); printf(_(" -T, --text-search-config=CFG\n" - "default text search conf
Re: [HACKERS] isolation check takes a long time
On Fri, Jul 20, 2012 at 01:39:34PM -0400, Alvaro Herrera wrote: > Excerpts from Noah Misch's message of mar jul 17 16:28:32 -0400 2012: > > The foreign key tests, however, would benefit > > from running under all three isolation levels. Let's control it per-spec > > instead of repeating the entire suite. > > Understood and agreed. Maybe we could use a new directive in the spec > file format for this. I was pondering something like this: setting "i-rc" "isolation" = "READ COMMITTED" setting "i-rr" "isolation" = "REPEATABLE READ" session "s1" setup { BEGIN TRANSACTION ISOLATION LEVEL :isolation; } step "foo" { SELECT 1; } permutation "i-rc" "foo" permutation "i-rr" "foo" That is, introduce psql-style variable substitutions in per-session "setup", "step" and "teardown" directives. Introduce the "setting" directive to declare possible values for each variable. Each permutation may name settings as well as steps. Order within the permutation would not matter; we could allow them anywhere in the list or only at the beginning. When the tester generates permutations, it would include all variable setting combinations. Thoughts? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
2012-07-17 06:32 keltezéssel, Alvaro Herrera írta: Excerpts from Tom Lane's message of vie jul 13 18:23:31 -0400 2012: Boszormenyi Zoltan writes: Try SET deadlock_timeout = 0; Actually, when I try that I get ERROR: 0 is outside the valid range for parameter "deadlock_timeout" (1 .. 2147483647) So I don't see any bug here. I committed this patch without changing this. If this needs patched, please speak up. I also considered adding a comment on enable_timeout_after about calling it with a delay of 0, but refrained; if anybody thinks it's necessary, suggestions are welcome. Thanks for committing this part. Attached is the revised (and a lot leaner, more generic) lock timeout patch, which introduces new functionality for the timeout registration framework. The new functionality is called "extra timeouts", better naming is welcome. Instead of only the previously defined (deadlock and statement) timeouts, the "extra" timeouts can also be activated from within ProcSleep() in a linked way. The lock timeout is a special case of this functionality. To show this, this patch is split into two again to make reviewing easier. This way, the timeout framework is really usable for external modules, as envisioned by you guys Also, rewriting statement and deadlock timeouts using this functionality and unifying the two registration interfaces may be possible later. But messing up proven and working code is not in the scope of this patch or my job at this point. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ diff -durpN postgresql/src/backend/port/posix_sema.c postgresql.1/src/backend/port/posix_sema.c --- postgresql/src/backend/port/posix_sema.c 2012-04-16 19:57:22.438915489 +0200 +++ postgresql.1/src/backend/port/posix_sema.c 2012-07-22 21:34:50.475375677 +0200 @@ -24,6 +24,7 @@ #include "miscadmin.h" #include "storage/ipc.h" #include "storage/pg_sema.h" +#include "utils/timeout.h" #ifdef USE_NAMED_POSIX_SEMAPHORES @@ -313,3 +314,31 @@ PGSemaphoreTryLock(PGSemaphore sema) return true; } + +/* + * PGSemaphoreTimedLock + * + * Lock a semaphore (decrement count), blocking if count would be < 0 + * Return if lock_timeout expired + */ +bool +PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK) +{ + int errStatus; + bool timeout; + + do + { + ImmediateInterruptOK = interruptOK; + CHECK_FOR_INTERRUPTS(); + errStatus = sem_wait(PG_SEM_REF(sema)); + ImmediateInterruptOK = false; + timeout = ExtraTimeoutCondition(); + } while (errStatus < 0 && errno == EINTR && !timeout); + + if (timeout) + return false; + if (errStatus < 0) + elog(FATAL, "sem_wait failed: %m"); + return true; +} diff -durpN postgresql/src/backend/port/sysv_sema.c postgresql.1/src/backend/port/sysv_sema.c --- postgresql/src/backend/port/sysv_sema.c 2012-05-14 08:20:56.284830580 +0200 +++ postgresql.1/src/backend/port/sysv_sema.c 2012-07-22 21:34:50.476375683 +0200 @@ -27,6 +27,7 @@ #include "miscadmin.h" #include "storage/ipc.h" #include "storage/pg_sema.h" +#include "utils/timeout.h" #ifndef HAVE_UNION_SEMUN @@ -492,3 +493,36 @@ PGSemaphoreTryLock(PGSemaphore sema) return true; } + +/* + * PGSemaphoreTimedLock + * + * Lock a semaphore (decrement count), blocking if count would be < 0 + * Return if lock_timeout expired + */ +bool +PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK) +{ + int errStatus; + bool timeout; + struct sembuf sops; + + sops.sem_op = -1; /* decrement */ + sops.sem_flg = 0; + sops.sem_num = sema->semNum; + + do + { + ImmediateInterruptOK = interruptOK; + CHECK_FOR_INTERRUPTS(); + errStatus = semop(sema->semId, &sops, 1); + ImmediateInterruptOK = false; + timeout = ExtraTimeoutCondition(); + } while (errStatus < 0 && errno == EINTR && !timeout); + + if (timeout) + return false; + if (errStatus < 0) + elog(FATAL, "semop(id=%d) failed: %m", sema->semId); + return true; +} diff -durpN postgresql/src/backend/port/win32_sema.c postgresql.1/src/backend/port/win32_sema.c --- postgresql/src/backend/port/win32_sema.c 2012-06-11 06:22:48.137921483 +0200 +++ postgresql.1/src/backend/port/win32_sema.c 2012-07-22 21:34:50.476375683 +0200 @@ -16,6 +16,7 @@ #include "miscadmin.h" #include "storage/ipc.h" #include "storage/pg_sema.h" +#include "utils/timeout.h" static HANDLE *mySemSet; /* IDs of sema sets acquired so far */ static int numSems; /* number of sema sets acquired so far */ @@ -209,3 +210,65 @@ PGSemaphoreTryLock(PGSemaphore sema) /* keep compiler quiet */ return false; } + +/* + * PGSemaphoreTimedLock + * + * Lock a semaphore (decrement count), blocking if count would be < 0. + * Serve the interrupt if interruptOK is true. + * Return if lock_timeout expired. + */ +bool +PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK) +{ + DWORD ret; + HANDLE wh[2]; + boo