Re: [HACKERS] GiST range-contained-by searches versus empty ranges
I wrote:
> I started to wonder why the test in range_gist_consistent_int() for
> RANGESTRAT_CONTAINED_BY was "return true" (ie, search the entire index)
> rather than range_overlaps, which is what is tested in the comparable
> case in rtree_internal_consistent().  The regression tests showed me
> how come: an empty-range index entry should be reported as being
> contained by anything, but range_overlaps won't necessarily find such
> entries.  (The rtree case is all right because there is no equivalent
> of an empty range in boxes, circles, or polygons.)

> I am not satisfied with this state of affairs though: range_contained_by
> really ought to be efficiently indexable, but with the current coding
> an index search is nearly useless.  Also, so far as I can see, the
> current penalty function allows empty-range entries to be scattered
> basically anywhere in the index, making a search for "= empty" pretty
> inefficient too.

> The first solution that comes to mind is to make the penalty and
> picksplit functions forcibly segregate empty ranges from others, that
> is, a split will never put empty ranges together with non-empty ones.
> Then we can assume that a non-empty internal node doesn't represent
> any empty leaf entries, and avoid descending to it when it doesn't
> overlap the target range.  Then the equality-search case could be
> improved too.

After looking through the code a bit, it seems that this isn't possible
to solve with a localized fix after all.  Yes, the opclass-specific
picksplit function can positively guarantee to separate empty ranges
from others when it splits a page ... but it has no control over what
happens when an item is added to an index page without a page split
occurring.

Consider a scenario where we have a multi-page GiST index containing no
empty ranges, and an empty-range item needs to be added.  As the code is
currently written, the central GiST code will choose one of the existing
root-page items to add the new item to; there is no way to persuade it
that a new top-level item would be a better idea.  What's more, if we
try to do so by having the penalty function return infinity for adding
an empty range to a non-empty item, we'll end up with empty ranges being
randomly added to any/all of the items, since we'll get the exact same
penalty for all the non-empty items.  So, far from segregating the
empties, we'll just end up with them cluttered all over the index, just
like now.

This is not really a new problem: there is code in there that is trying
to segregate NULLs from non-NULL entries, and it is equally
broken/ineffective.

I'm inclined to propose that we should add some logic to say that
merging a new item into an existing one is forbidden if the penalty
function returns plus-infinity for the case.  If all existing items on a
page return infinity, the new item must be added to the page (possibly
causing a page split) instead of being merged into any existing one.
(Of course, gistpenalty() should be fixed to return infinity, not just a
randomly chosen large value, for the null-and-not-null case.)

Comments?

			regards, tom lane
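PS: to make the forbidden-merge rule concrete, here is a rough sketch of
the choose-a-subtree loop (illustrative only, not the actual gistchoose()
code; isinf() from <math.h> stands in for however we end up spelling
"plus infinity"):

#include <math.h>

/*
 * Sketch: return the index of the existing item with the lowest penalty,
 * or -1 if every item returned +infinity, meaning the new item must be
 * added to the page (possibly causing a page split) rather than merged
 * into an existing entry.
 */
static int
choose_subtree(double (*penalty) (int item, const void *newkey),
			   int nitems, const void *newkey)
{
	int			best = -1;
	double		best_penalty = 0;
	int			i;

	for (i = 0; i < nitems; i++)
	{
		double		p = penalty(i, newkey);

		if (isinf(p))
			continue;			/* merging into this item is forbidden */
		if (best < 0 || p < best_penalty)
		{
			best = i;
			best_penalty = p;
		}
	}
	return best;				/* -1 => caller must add a new item instead */
}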
Re: [HACKERS] Avoiding repeated snapshot computation
Andres Freund wrote:
> All current x86 cpus use 64bytes.

That matches what I found in recent research on the topic.

-Kevin
Re: [HACKERS] Avoiding repeated snapshot computation
On Saturday, November 26, 2011 11:39:23 PM Robert Haas wrote:
> On Sat, Nov 26, 2011 at 5:28 PM, Andres Freund wrote:
>> On Saturday, November 26, 2011 09:52:17 PM Tom Lane wrote:
>>> I'd just as soon keep the fields in a logical order.
>>
>> Btw, I don't think the new order is necessarily worse than the old one.
>
> You forgot to attach the benchmark results.
>
> My impression is that cache lines on modern hardware are 64 or 128
> *bytes*, in which case this wouldn't matter a bit.
>
> But testing is even better than guessing.

Being prodded like that, I ran a very quick benchmark on my workstation.
Unfortunately that means I cannot work during the time, which is why I
kept it rather short...

That machine has 2 E5520 @ 2.27GHz, which means 2 (sockets) * 4 (cores)
* 2 (threads), and 24GB of RAM.

Data was initialized with:
pgbench -h /tmp/ --unlogged-tables -i -s 20 pgbench

pgbench -h /tmp/ pgbench -S -j 16 -c 16 -T 60
origsnap:    92825.743958 93145.110901 93389.915474 93175.482351
reordersnap: 93560.183272 93613.333494 93495.263012 93523.368489

pgbench -h /tmp/ pgbench -S -j 32 -c 32 -T 60
origsnap:    81846.743329 81545.175672 81702.755226 81576.576435
             81228.154119 81546.047708 81421.436262
reordersnap: 81823.479196 81787.784508 81820.242145 81790.263415
             81762.421592 81496.333144 81732.088876

At that point I noticed I had accidentally run with a nearly stock
config...  An even shorter run with a more appropriate config yielded:

pgbench -h /tmp/ pgbench -S -j 32 -c 32 -T 20
origsnap:    102234.664020 102003.449741 102119.509053 101722.410387
             101973.651318 102056.440561
reordersnap: 103444.877879 103385.08 103302.318923 103372.659486
             103330.157612 103313.833821

Looks like a win to me, even on this comparatively small machine.

Andres
Re: [HACKERS] Avoiding repeated snapshot computation
On Saturday, November 26, 2011 11:39:23 PM Robert Haas wrote:
> On Sat, Nov 26, 2011 at 5:28 PM, Andres Freund wrote:
>> On Saturday, November 26, 2011 09:52:17 PM Tom Lane wrote:
>>> I'd just as soon keep the fields in a logical order.
>>
>> Btw, I don't think the new order is necessarily worse than the old one.
>
> You forgot to attach the benchmark results.
>
> My impression is that cache lines on modern hardware are 64 or 128
> *bytes*, in which case this wouldn't matter a bit.

All current x86 cpus use 64 bytes.  The second "128 bit" reference was a
typo, sorry for that.  And why is 72 => 56 *bytes* (I even got that one
right) not relevant for 64-byte cachelines?

And yes, I didn't add benchmark results.  I don't think I *have* to do
that when making suggestions to somebody trying to improve something
specific.  I also currently don't have hardware where I can sensibly run
at a high enough concurrency to see GetSnapshotData take ~40% of
runtime.  Additional cacheline references around synchronized access can
hurt, to my knowledge...

Andres
Re: [HACKERS] Avoiding repeated snapshot computation
On Sat, Nov 26, 2011 at 5:28 PM, Andres Freund wrote:
> On Saturday, November 26, 2011 09:52:17 PM Tom Lane wrote:
>> I'd just as soon keep the fields in a logical order.
>
> Btw, I don't think the new order is necessarily worse than the old one.

You forgot to attach the benchmark results.

My impression is that cache lines on modern hardware are 64 or 128
*bytes*, in which case this wouldn't matter a bit.

But testing is even better than guessing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Avoiding repeated snapshot computation
On Saturday, November 26, 2011 09:52:17 PM Tom Lane wrote:
> I'd just as soon keep the fields in a logical order.

Btw, I don't think the new order is necessarily worse than the old one.

Andres
Re: [HACKERS] Avoiding repeated snapshot computation
On Saturday, November 26, 2011 09:52:17 PM Tom Lane wrote:
> Andres Freund writes:
>> You could also try if it makes a difference reducing SnapshotData to
>> one instead of two cachelines. The data itself fits into one without
>> problems. Trivial patch attached.
>
> On what grounds do you argue that this patch gets SnapshotData into one
> cacheline today (and on what hardware), or that it will continue to do
> so in the future?  If we're this close to the edge then any future
> addition to the struct will destroy the point.  I'd just as soon keep
> the fields in a logical order.

To my knowledge there is no current and supported (i.e. I don't count
Itanium) hardware that doesn't have 64-byte cachelines, except in the
embedded market.  I also don't see a movement towards 128-bit cpus or
anything else requiring bigger pointers.  If platforms with 128-byte
cachelines or such were to gain popularity, nothing would be lost by
that change.  Which change are you "afraid" of?

Now, the holes I talked about obviously were on a 64-bit machine.  There
are a few situations where improving the layout so it fits with 8-byte
alignment for pointers makes the situation worse for 32-bit, but looking
a bit carefully at the changes one makes should catch those problems.

Also, it's not *that* close to overflowing again: it was 72 bytes
before, now it's 56 on a 64-bit x86 machine with the Linux ABI.

I agree that logical order is important; I don't propose changing all
structs in pg mechanically.  It's sad that there is no sensible method
to let the compiler safely reorder struct members across compilers
(-versions) and platforms...

Andres
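PS: for anyone who wants to see the padding effect without digging
through the ABI documents, a self-contained toy (x86-64 System V ABI
assumed; the field names are made up, not SnapshotData's):

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

/* bools interleaved with pointers: each bool gets padded to 8 bytes */
struct padded
{
	uint32_t	a;				/* 4 bytes + 4 padding */
	char	   *p1;				/* 8 bytes */
	bool		f1;				/* 1 byte + 7 padding */
	char	   *p2;				/* 8 bytes */
	bool		f2;				/* 1 byte + 7 padding */
};

/* same members, pointers first and small members packed together */
struct packed
{
	char	   *p1;
	char	   *p2;
	uint32_t	a;
	bool		f1;
	bool		f2;				/* 4 + 1 + 1 bytes + 2 padding */
};

int
main(void)
{
	printf("padded: %zu bytes\n", sizeof(struct padded));	/* 40 here */
	printf("packed: %zu bytes\n", sizeof(struct packed));	/* 24 here */
	return 0;
}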
Re: [HACKERS] [PATCH] PostgreSQL fails to build with 32bit MinGW-w64
On 11/24/2011 06:29 AM, Lars Kanis wrote:
>> Isn't it better to check the value of the macros itself rather than
>> checking for system-dependent macros that do not directly relate to
>> the issue?  Specifically for the getaddrinfo.c case, I think
>>
>>     #if EAI_NODATA != EAI_NONAME
>>
>> is a better check than checking for
>>
>>     #if !defined(__MINGW64_VERSION_MAJOR) && !defined(WIN32_ONLY_COMPILER)
>>     /* MSVC/WIN64 duplicate */
>
> Yes, it's better and it works for all described test environments.
>
>> For the win32.h, I really don't understand why _WINSOCKAPI_ was
>> defined before.  Some googling suggests that defining _WINSOCKAPI_
>> beforehand prevents inclusion of winsock.h, but that does not have any
>> relation to inclusion of <winsock2.h>, and if <winsock2.h> is included
>> first, it should be ok.  If this guess is right, perhaps it could be
>> better to remove the three lines:
>>
>>     #if !defined(WIN64) || defined(WIN32_ONLY_COMPILER)
>>     #define _WINSOCKAPI_
>>     #endif

No, this broke some compilers, IIRC (probably the native mingw compiler,
which is in use by several buildfarm members).  Getting this right was
very tricky and time-consuming when I was adding support for the 64-bit
mingw-w64 compiler, and there were a couple of rounds of breakage.  I'm
therefore much more inclined to go the way of your earlier patch, which
seems much less risky.

> I only changed this for consistency.  For me, it works without that
> define in all test environments, too.
>
>> +/* __MINGW64_VERSION_MAJOR is related to both 32/64 bit gcc compiles by
>> + * mingw-w64, however it gets defined only after
>>
>> Why not use __MINGW32__, which is defined without including any
>> headers?
>
> Because it's defined by other than mingw-w64 compilers.

We have a bunch of compilers to support here.  There are LOTS of
compiler scenarios on Windows (several versions of MSVC, 32-bit and
64-bit mingw-w64, native mingw gcc, and a couple of Cygwin-based
compilers), and keeping track of them all and making sure they don't
break can be a pain.

cheers

andrew
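PS: for reference, the value-based guard being discussed would end up
looking roughly like this in getaddrinfo.c (a sketch, not the committed
code; the message text is invented):

/*
 * Guard the duplicate case label by comparing the macro values
 * themselves, instead of trying to enumerate the compilers on which
 * EAI_NODATA and EAI_NONAME happen to collide.
 */
switch (errcode)
{
	case EAI_NONAME:
#if defined(EAI_NODATA) && EAI_NODATA != EAI_NONAME
	case EAI_NODATA:			/* a distinct value on most platforms */
#endif
		return "Unknown host";
	default:
		return "Unknown server error";
}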
Re: [HACKERS] Avoiding repeated snapshot computation
Andres Freund writes:
> You could also try if it makes a difference reducing SnapshotData to
> one instead of two cachelines. The data itself fits into one without
> problems. Trivial patch attached.

On what grounds do you argue that this patch gets SnapshotData into one
cacheline today (and on what hardware), or that it will continue to do
so in the future?  If we're this close to the edge then any future
addition to the struct will destroy the point.  I'd just as soon keep
the fields in a logical order.

			regards, tom lane
Re: [HACKERS] Avoiding repeated snapshot computation
Hi,

On Saturday, November 26, 2011 04:52:50 PM Pavan Deolasee wrote:
> I think now that we have reduced the run time of the function itself,
> we should now try to reduce the number of times the function is
> called. Robert proposed a way to reduce the number of calls per
> transaction. I think we can go one more step further and reduce the
> number across transactions.

You could also try if it makes a difference reducing SnapshotData to one
instead of two cachelines.  The data itself fits into one without
problems.  Trivial patch attached.

Generally I think we should check this for the more commonly used
structures; we have many with too much padding.

Andres

diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index 859e52a..b9c2209 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -46,13 +46,14 @@ typedef struct SnapshotData
 	 */
 	TransactionId xmin;			/* all XID < xmin are visible to me */
 	TransactionId xmax;			/* all XID >= xmax are invisible to me */
-	uint32		xcnt;			/* # of xact ids in xip[] */
 	TransactionId *xip;			/* array of xact IDs in progress */
+	uint32		xcnt;			/* # of xact ids in xip[] */
 	/* note: all ids in xip[] satisfy xmin <= xip[i] < xmax */
 	int32		subxcnt;		/* # of xact ids in subxip[] */
 	TransactionId *subxip;		/* array of subxact IDs in progress */
 	bool		suboverflowed;	/* has the subxip array overflowed? */
 	bool		takenDuringRecovery;	/* recovery-shaped snapshot? */
+	bool		copied;			/* false if it's a static snapshot */
 
 	/*
 	 * note: all ids in subxip[] are >= xmin, but we don't bother filtering
@@ -61,7 +62,6 @@ typedef struct SnapshotData
 	CommandId	curcid;			/* in my xact, CID < curcid are visible */
 	uint32		active_count;	/* refcount on ActiveSnapshot stack */
 	uint32		regd_count;		/* refcount on RegisteredSnapshotList */
-	bool		copied;			/* false if it's a static snapshot */
 } SnapshotData;
 
 /*
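PS: if we do decide to care about the cacheline boundary, a compile-time
tripwire next to the struct definition might be worth more than any
particular field order.  A sketch, assuming a C11 compiler or an
equivalent assertion macro, with 64 bytes as the assumed line size:

#include <assert.h>				/* C11 static_assert */

/*
 * Trip at build time if SnapshotData grows past one assumed 64-byte
 * cacheline, instead of silently regressing.
 */
static_assert(sizeof(SnapshotData) <= 64,
			  "SnapshotData no longer fits in one 64-byte cacheline");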
[HACKERS] psql line number reporting from stdin
There is a long-standing oddity in psql that running

    psql -f foo.sql

returns error messages with file name and line number, like

    psql:foo.sql:1: ERROR:  syntax error at or near "foo"

but running

    psql < foo.sql

does not.  I suggest we change the latter to print

    psql:<stdin>:1: ERROR:  syntax error at or near "foo"

Other examples for the use of the spelling "<stdin>" in this context
include gcc and slonik.

Error messages printed in interactive mode will not be affected, of
course.

Patch attached.

diff --git i/src/bin/psql/common.c w/src/bin/psql/common.c
index 5ab736e..18260f1 100644
--- i/src/bin/psql/common.c
+++ w/src/bin/psql/common.c
@@ -166,6 +166,8 @@ psql_error(const char *fmt,...)
 
 	if (pset.inputfile)
 		fprintf(stderr, "%s:%s:" UINT64_FORMAT ": ", pset.progname, pset.inputfile, pset.lineno);
+	else if (!isatty(fileno(stdin)))
+		fprintf(stderr, "%s:<stdin>:" UINT64_FORMAT ": ", pset.progname, pset.lineno);
 	va_start(ap, fmt);
 	vfprintf(stderr, _(fmt), ap);
 	va_end(ap);
Re: [HACKERS] Patch: Perl xsubpp
On 10/12/2011 08:55 PM, Alex Hunsaker wrote:
> On Wed, Oct 12, 2011 at 17:53, David E. Wheeler wrote:
>> On Sep 15, 2011, at 3:04 PM, Alex Hunsaker wrote:
>>> Close, seems I was wrong about the typemap: ExtUtils::ParseXS does
>>> not install a new one, so we still need to point to the one in
>>> privlib.  Also xsubpp is not executable, so the test should be -r or
>>> something.  Also, I don't think we should change the configure switch
>>> tests to test XSUBPPDIR.  Find those plus some minor typos fixed in
>>> the attached.
>>
>> Doesn't look like this has been applied yet.  I think it ought to be
>> backported, too, frankly.  Did I miss it?
>
> Nah, probably should add it to the next commit fest so it does not get
> forgotten.

Committed.

cheers

andrew
Re: [HACKERS] Notes on implementing URI syntax for libpq
Excerpts from Robert Haas's message of Thu Nov 24 13:57:17 +0200 2011:
>
> I think it would be really weird not to support user:pw@host:port.  You
> can presumably also support the JDBC style for backward compatibility,
> but I don't think we should adopt that syntax as project standard.

By the way, if we're already considering this, what about special syntax
for SSL, instead of JDBC's "&ssl=true" thingy?

Given that the default sslmode is "prefer", I assume libpq tries SSL
first, then falls back to plain text if that's not available.

To me, it looks much more natural if the fact that SSL is/should be used
is stated early in the URI syntax, like "https://" or "svn+ssh://",
rather than in the query parameters.  (If the parameters were to be
passed to a remote service to process, like it's done with HTTP[S], this
would not make any sense at all.)

But given that sslmode can currently be any of "disable", "allow",
"prefer", "require", "verify-ca" or "verify-full" (and who knows if any
new allowed mode could show up later), allowing "&sslmode=whatever"
makes sense.  Note that this is not the same as "&ssl=whatever".

So how about this:

  postgresql:ssl://user:pw@host:port/dbname?sslmode=...

The "postgresql:ssl://" designator would assume "sslmode=require" if not
overridden in extra parameters, and "postgresql://" would imply
"sslmode=prefer".  And to disable SSL you would pick either designator
and append "sslmode=disable".  JDBC's "ssl=true" will translate to
"sslmode=require".

If we can decide on this, we should also put reasonable effort into
making JDBC support the same syntax.

Thoughts?
--
Alex
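PS: in libpq terms the designator would only pick the default; an
explicit sslmode query parameter would still win.  Something like this
(a sketch with invented names, not actual libpq code):

#include <string.h>

/*
 * Sketch: map the URI designator to a default sslmode; any explicit
 * "sslmode=..." query parameter would override this afterwards.
 */
static const char *
default_sslmode_for_designator(const char *designator)
{
	if (strcmp(designator, "postgresql:ssl") == 0)
		return "require";
	if (strcmp(designator, "postgresql") == 0)
		return "prefer";
	return NULL;				/* unrecognized designator */
}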
Re: [HACKERS] why do we need two snapshots per query?
Hi,

Robert Haas writes:
> On Sun, Nov 13, 2011 at 8:57 PM, Robert Haas wrote:
>> In the -M extended case, we take a snapshot from exec_parse_message(),
>> and the same two in the exec_bind_message() call that are taken in the
>> -M prepared case.  So reducing the prepared case from two snapshots to
>> one will reduce the extended case from three snapshots to two, thus
>> saving one snapshot per query regardless of how it's executed.

I like the idea, and I think it's better semantics to use the same
snapshot for planning and executing in the simple query case.

I didn't try to reproduce the performance benefits seen by Robert here,
nor did I try to double-check compilation warnings etc.  I guess that
reviewing a committer's patch allows for being not as thorough :)

> +		/* Done with the snapshot used for parameter I/O and parsing/planning */
> +		if (snapshot_set)
> +			PopActiveSnapshot();

This comment needs adjusting.

> diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
> index 466727b..c41272b 100644
> --- a/src/backend/tcop/pquery.c
> +++ b/src/backend/tcop/pquery.c
> @@ -455,7 +455,7 @@ FetchStatementTargetList(Node *stmt)
>   * tupdesc (if any) is known.
>   */
>  void
> -PortalStart(Portal portal, ParamListInfo params, Snapshot snapshot)
> +PortalStart(Portal portal, ParamListInfo params, bool use_active_snapshot)

You need to be editing the comments for this function.  To be specific,
you didn't update this text:

 * The caller can optionally pass a snapshot to be used; pass InvalidSnapshot
 * for the normal behavior of setting a new snapshot.  This parameter is
 * presently ignored for non-PORTAL_ONE_SELECT portals (it's only intended
 * to be used for cursors).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Notes on implementing URI syntax for libpq
Excerpts from Robert Haas's message of Thu Nov 24 15:59:08 +0200 2011:
>
> I think we could do something like:
>
> postgresql://user:pw@host:port/database?param1=val1&param2=val2&param3=val3&...

I wonder if this should be allowed syntax (i.e. specify a user, but
connect locally, so leave 'host' to be an empty string):

  postgresql://user@/

Furthermore, if we want to connect locally, but to a non-default port:

  postgresql://user@:5433/

I would also think that if one is to specify the password in the URI,
and the password happens to contain the @-sign (e.g. "!@#$%^"), it
should be percent-encoded, like:

  postgresql://user:!%40#$%^@/

Reasonable?
--
Alex
[HACKERS] GiST range-contained-by searches versus empty ranges
I started to wonder why the test in range_gist_consistent_int() for
RANGESTRAT_CONTAINED_BY was "return true" (ie, search the entire index)
rather than range_overlaps, which is what is tested in the comparable
case in rtree_internal_consistent().  The regression tests showed me how
come: an empty-range index entry should be reported as being contained
by anything, but range_overlaps won't necessarily find such entries.
(The rtree case is all right because there is no equivalent of an empty
range in boxes, circles, or polygons.)

I am not satisfied with this state of affairs though: range_contained_by
really ought to be efficiently indexable, but with the current coding an
index search is nearly useless.  Also, so far as I can see, the current
penalty function allows empty-range entries to be scattered basically
anywhere in the index, making a search for "= empty" pretty inefficient
too.

The first solution that comes to mind is to make the penalty and
picksplit functions forcibly segregate empty ranges from others, that
is, a split will never put empty ranges together with non-empty ones.
Then we can assume that a non-empty internal node doesn't represent any
empty leaf entries, and avoid descending to it when it doesn't overlap
the target range.  Then the equality-search case could be improved too.

Thoughts, better ideas?

			regards, tom lane
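PS: for concreteness, the consistent-function change I'm imagining is
essentially this (a sketch, assuming the empty-segregation invariant
described above could be made to hold; the helper names only loosely
follow the real rangetypes code):

case RANGESTRAT_CONTAINED_BY:
	/*
	 * If empty ranges were never stored under a non-empty internal
	 * entry, we could prune non-overlapping subtrees instead of
	 * unconditionally returning true.  Empty internal keys may still
	 * hold empty leaf entries, which are contained by anything.
	 */
	if (range_is_empty(key))
		return true;
	return ranges_overlap(key, query);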
Re: [HACKERS] Prep object creation hooks, and related sepgsql updates
Kohei KaiGai writes:
> We still don't have a clear direction for the way to implement external
> permission checks at object creation time.  So, please consider these
> patches to be at the proof-of-concept stage: using a prep-creation-hook
> for permission checks.

I wonder if you could implement that as an extension, given that the
command trigger patch finds its way in.  What do you think?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Notes on implementing URI syntax for libpq
Excerpts from Peter Eisentraut's message of Thu Nov 24 22:05:09 +0200 2011:
>
> On tor, 2011-11-24 at 15:43 +0200, Alexander Shulgin wrote:
>> Huh?  The service definitions are read from a local pg_service.conf,
>> and are specified by setting PGSERVICE (and PGSERVICEFILE) environment
>> variables, no?
>>
>> What would you do with such a URI if you need other people to connect
>> to the same service?  Send them the URI along with the
>> pg_service.conf?
>
> A full URI would also rely on host names or IP addresses being the same
> everywhere.  It's all a matter of degree ...

True, but it is much more reasonable to expect that hostnames will
resolve to the same addresses most of the time (save for zone-change
propagation time).

Still, I can imagine cases where this may be useful, like local networks
with shared pg_service.conf files.  And since we don't need to do
anything special to support the behavior (i.e.
postgresql:///?service=foo is going to work out of the box), this seems
to be a non-problem.
--
Alex
Re: [HACKERS] Inlining comparators as a performance optimisation
3 items are attached:

1. A spreadsheet, results.ods, which has results for automated multiple
runs of the "doubled up" dellstore orderlines queries (where orderlines
is 385 MB).  Results are sorted, with the median (or the lower middle
value; I didn't take the mean of the two middle runs) result for each
run highlighted as representative.  There are 3 builds of Postgres
(HEAD, not inlined, optimized), each on its own sheet in the
spreadsheet.  The cache was warmed for each query, and subsequently the
query was run 20 times.

2. A python script I hacked together that should run on anything around
Python 2.5+, with a single dependency, psycopg2.  Look at --help to see
how it works.  This is the script that actually generated the figures
that went into the spreadsheet, by being run once for each type of
build.  You can fairly easily play along at home with this, for these
and other queries.  It will spit out CSV files.  It is designed to warm
the cache (by default 3 times before each query) to eliminate caching
effects.  You can add your own query to the python list to have it run
by the script, to generate the same set of figures for that query.

I'm rather curious as to how much of an impact this optimisation will
have on queries with unique nodes, joins and grouping when they rely on
a sort node for their input, in the real world.  Certainly, a query need
not have an ORDER BY clause to see benefits, perhaps substantial
benefits.  The logic to parse sort node details is rather
unsophisticated right now though, due to my focus on the obvious ORDER
BY case.

3. The revision of the patch that was actually tested, now with inlining
specialisations for the single-scanKey case, and a non-inlined
specialisation for multiple scanKeys where we can still benefit from
eliding the SQL function call machinery for the first scanKey, which is
often almost as useful as being able to do so for all scanKeys.  It also
selects a sorting specialisation when it can in tuplesort_begin_heap,
per Robert's suggestion.  Reviewers will want to comment out line 731 of
tuplesort.c, "#define optimize", to quickly get unoptimized behaviour
for comparative purposes.  Furthermore, if you'd like to see what this
would look like without inlining, you can simply comment out assignments
of inline variants of specialisations (i.e. int4inlqsort_arg and
int8inlqsort_arg) in tuplesort_begin_heap.

It has been suggested that I'm chasing diminishing returns by inlining,
as I go further for a smaller benefit.  Of course, that's true.
However, I believe that I'm not chasing them past the point where that
ceases to make sense, and these figures support that contention -
chasing diminishing returns is in the nature of this kind of work.
Here, the additional benefit of inlining accounts for over an entire
second shaved off a query that originally took 7056ms, so that's not to
be sniffed at.

I'll reiterate that qsort_arg has only been modified 4 times after its
initial commit in 2006, and these were all trivial changes.  While the
way that I ape generic programming with the preprocessor is on the ugly
side, what I've done can be much more easily understood with reference
to qsort_arg itself.  Robert said that duplicating the sort function was
"iffy".  However, that already happened long ago, as we're already
maintaining both qsort_arg.c and qsort.c, and comments already urge
maintainers to keep the two consistent.  That's not been a problem
though, because, as I've said, there has never been any call to make
substantive changes to either in all those years.

We could probably further reduce the code footprint of all of this by
having template_qsort_arg.h generate comparators directly.  That might
be inflexible in a way that turns out to matter though, like if we
wanted to use this for non-HAVE_INT64_TIMESTAMP timestamps and had to
add NaN crud.

My next step is to see how this goes with hundreds of sorts in the
hundreds of megabytes on a high-end server.  I don't have immediate
access to one, but I'm working that out.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

From 92185762297ba9819eb7f1b208db03663d9fdfa8 Mon Sep 17 00:00:00 2001
From: peter2ndQuadrant
Date: Sun, 20 Nov 2011 20:36:25 +
Subject: [PATCH] Initial commit of optimization

Stop directly using oid

Added int8 quicksort fast path specialisation, which can also be used
in place of F_TIMESTAMP_CMP for HAVE_INT64_TIMESTAMP builds.

Rebased, revised patch for -hackers, with timestamp and int8 fast path
sorting using the same int8 specialization.

Remove unneeded line

Rebase for -hackers

Discriminate against cases where nKeys > 1 when selecting sort
specialisation.  We now support fast path sorting with multiple
scanKeys.  We now inline for one scanKey, but do not inline our
comparetup_heap replacement when multiple scanKeys are used (although
this is at the compiler's discretion).  However, when multiple scanKeys
are used, we still use a specialisati
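PS: for readers who haven't opened the patch, the preprocessor
"template" technique amounts to something like this (a greatly
simplified, self-contained sketch of the idea, not the patch itself - an
insertion sort stands in for the real qsort_arg body):

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Instantiate a sort specialised for TYPE: the comparison below is a
 * plain operator the compiler can inline, rather than the indirect
 * comparator call qsort_arg() makes for every pair.
 */
#define TEMPLATE_SORT(NAME, TYPE)								\
static void														\
NAME(TYPE *a, size_t n)											\
{																\
	size_t		i;												\
																\
	for (i = 1; i < n; i++)										\
	{															\
		TYPE		v = a[i];									\
		size_t		j = i;										\
																\
		while (j > 0 && a[j - 1] > v)	/* inlined comparison */ \
		{														\
			a[j] = a[j - 1];									\
			j--;												\
		}														\
		a[j] = v;												\
	}															\
}

TEMPLATE_SORT(int4_sort, int32_t)
TEMPLATE_SORT(int8_sort, int64_t)

int
main(void)
{
	int32_t		v[] = {3, 1, 2};

	int4_sort(v, 3);
	printf("%d %d %d\n", v[0], v[1], v[2]);		/* prints 1 2 3 */
	return 0;
}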
Re: [HACKERS] Avoiding repeated snapshot computation
Pavan Deolasee writes:
> On Sat, Nov 26, 2011 at 10:43 PM, Robert Haas wrote:
>> Furthermore, it's hard to understand how this could be a net
>> improvement in the general case, because now both B and F are copying
>> everything twice (once to the shared area and once to backend-local
>> memory) instead of just once (to backend-local memory) and C and D are
>> sleeping (yikes!).

> Yes, B and F pay the price of a double copy.  But I think it can be a
> net saving because C and D (and many more, hopefully) don't need to
> recompute the snapshot again by going over a potentially large
> ProcArray.

Like Robert, I'm finding it hard to believe that this isn't going to be
a net pessimization in as many or more cases as it'd be a win.  Also,
didn't we just get rid of most of the issue of "going over a large
ProcArray" with the move of hot members to a separate array?

In the end, getting a snapshot is exactly about copying information out
of shared memory.  Perhaps it would be more productive to think about
how we could further modify the ProcArray representation to reduce the
impedance mismatch between it and the snapshot representation, rather
than introducing a second shared-memory copy of the same information.

			regards, tom lane
Re: [HACKERS] Avoiding repeated snapshot computation
On Sat, Nov 26, 2011 at 10:43 PM, Robert Haas wrote:
> On Sat, Nov 26, 2011 at 10:52 AM, Pavan Deolasee wrote:
>> What we can do is when a transaction comes to compute its snapshot, it
>> checks if some other transaction is already computing a snapshot for
>> itself.  If so, it just sleeps on the lock.  When the other process
>> finishes computing the snapshot, it saves the snapshot in a shared
>> area and wakes up all processes waiting for the snapshot.  All those
>> processes then just copy the snapshot from the shared area and they
>> are done.  This will not only reduce the total CPU consumption by
>> avoiding repetitive work, but would also reduce the total time for
>> which ProcArrayLock is held in SHARED mode by avoiding a pipeline of
>> GetSnapshotData calls.  I am currently trying the shared work queue
>> mechanism to implement this, but I am sure we can do this in some
>> other way too.
>
> I don't quite understand how this is going to work.

I will try, keeping it simple.

> Transaction A ends, invalidating the shared snapshot.  Now
> transactions B, C, and D come along wishing to take snapshots.  B
> begins taking a snapshot, so C and D wait (blech!)

Yeah, C and D wait.  But that's better than concurrently doing the same
computation.

> for that to be complete.  Then, they start copying the snapshot.

And they are holding the ProcArrayLock in shared mode.

> Meanwhile, transaction E ends, invalidating the shared snapshot,

E can't end until B, C and D are done with copying the snapshot.

> and then transaction F comes along, wanting to take a snapshot.  If F
> constructs a new shared snapshot, then doesn't that overwrite the same
> memory area C and D were busy copying?  It seems like F needs some way
> to know that C and D are done with the old shared snapshot before it
> starts writing a new one.

Right.  And that can easily be achieved by holding a shared lock on the
ProcArray.  The snapshot is invalidated while holding an exclusive lock,
and set/copied while holding a shared lock.  I am assuming setting a
boolean (valid/invalid) can safely be done with a shared lock.  But I
may be wrong.

> Furthermore, it's hard to understand how this could be a net
> improvement in the general case, because now both B and F are copying
> everything twice (once to the shared area and once to backend-local
> memory) instead of just once (to backend-local memory) and C and D are
> sleeping (yikes!).

Yes, B and F pay the price of a double copy.  But I think it can be a
net saving because C and D (and many more, hopefully) don't need to
recompute the snapshot again by going over a potentially large
ProcArray.  So, as I demonstrated in the illustration, the total time
for which the ProcArray lock is held will be significantly less with a
large number of clients.

> That could maybe be a gain at very high concurrency when spinlock
> contention is eating us alive, but it doesn't seem like a good idea in
> general.  Maybe I'm missing something; did you mean to attach a patch?

I have a patch that I am attaching.  It contains some unrelated changes
(don't know why; maybe I took a diff with some other branch), but you
will get the idea.  This needs improvement though, because it just
checks if a valid shared snapshot is available and copies that.  If not,
it goes ahead and computes one for itself.  We need something more
intelligent to know that a snapshot computation is in progress and we
should wait instead of building our own.

This patch had shown a 15-20% improvement on the HP box that we are
using for the benchmarks.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

shared-snapshot-v2.patch
Description: Binary data
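PS: in outline, the protocol in the patch is the following (a toy model,
not the patch itself: a pthread rwlock stands in for ProcArrayLock and a
byte array stands in for SnapshotData):

#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define SNAP_SIZE 64

static pthread_rwlock_t proc_array_lock = PTHREAD_RWLOCK_INITIALIZER;
static bool shared_valid = false;	/* cleared at transaction end */
static char shared_snap[SNAP_SIZE];	/* lives in shared memory */

/* stand-in for the expensive ProcArray scan */
static void
compute_snapshot(char *dst)
{
	memset(dst, 0, SNAP_SIZE);
}

void
get_snapshot(char *out)
{
	pthread_rwlock_rdlock(&proc_array_lock);
	if (shared_valid)
		memcpy(out, shared_snap, SNAP_SIZE);	/* cheap: reuse */
	else
	{
		compute_snapshot(out);					/* expensive: recompute */
		memcpy(shared_snap, out, SNAP_SIZE);	/* publish for others */
		/* concurrent readers write the same bytes; see caveat above */
		shared_valid = true;
	}
	pthread_rwlock_unlock(&proc_array_lock);
}

void
end_transaction(void)
{
	pthread_rwlock_wrlock(&proc_array_lock);
	/* ... clear our XID in the ProcArray ... */
	shared_valid = false;	/* ProcArray changed: invalidate the copy */
	pthread_rwlock_unlock(&proc_array_lock);
}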
Re: [HACKERS] [COMMITTERS] pgsql: Modify pg_dump to use error-free memory allocation macros. This
Bruce Momjian writes:
> Basically, if we keep the existing functions in common.c, we are
> deciding that it is never worth moving C functions to new C files for
> source code clarity.  I was not sure we had made that decision in the
> past.

I once held hope that git would let us be more flexible about this, but
experimentation has shown that it's not significantly smarter than CVS
about back-patches in moved code :-(.  I don't want to say that we
should *never* move code once it's released; but I think the bar for
that needs to be set pretty high, and this particular case isn't
clearing the bar for me.

			regards, tom lane
Re: [HACKERS] Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.
On 10/05/2011 03:58 AM, Amit Khandekar wrote:
> On 5 October 2011 12:29, Alex Hunsaker wrote:
>> Find it attached.  [ Note I didn't put the check inside the
>> if (ret == utf8_str) as it seemed a bit cleaner (indentation wise) to
>> have it outside ]
>
> I have no more issues with the patch.

Committed.

cheers

andrew
[HACKERS] Re: [COMMITTERS] pgsql: Modify pg_dump to use error-free memory allocation macros. This
Tom Lane wrote:
>>> It is not possible to just link the old common.c into pg_restore and
>>> pg_dumpall because it contains calls to pg_dump-only functions.
>>>
>>> Ideas?
>
>> The only other idea I have is to keep most functions in the mis-named
>> common.c and move the memory functions into dumpmem.c and dumpmem.h
>> and include that in the other C files, but that doesn't help with
>> long-term code clarity.
>
> I don't think there is any "long-term code clarity" gain here that is
> worth the maintenance headache.  Please put the existing functions back
> where they were.  Inventing new files for the new code seems fine.

Well, you do a lot more backpatching than I do, so I am willing to do as
you request, but looking at the git history for common.c, I don't see a
lot of backpatching for this file.

Basically, if we keep the existing functions in common.c, we are
deciding that it is never worth moving C functions to new C files for
source code clarity.  I was not sure we had made that decision in the
past.

--
Bruce Momjian        http://momjian.us
EnterpriseDB         http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] Avoiding repeated snapshot computation
On Sat, Nov 26, 2011 at 10:52 AM, Pavan Deolasee wrote:
> What we can do is when a transaction comes to compute its snapshot, it
> checks if some other transaction is already computing a snapshot for
> itself.  If so, it just sleeps on the lock.  When the other process
> finishes computing the snapshot, it saves the snapshot in a shared
> area and wakes up all processes waiting for the snapshot.  All those
> processes then just copy the snapshot from the shared area and they
> are done.  This will not only reduce the total CPU consumption by
> avoiding repetitive work, but would also reduce the total time for
> which ProcArrayLock is held in SHARED mode by avoiding a pipeline of
> GetSnapshotData calls.  I am currently trying the shared work queue
> mechanism to implement this, but I am sure we can do this in some
> other way too.

I don't quite understand how this is going to work.  Transaction A ends,
invalidating the shared snapshot.  Now transactions B, C, and D come
along wishing to take snapshots.  B begins taking a snapshot, so C and D
wait (blech!) for that to be complete.  Then, they start copying the
snapshot.  Meanwhile, transaction E ends, invalidating the shared
snapshot, and then transaction F comes along, wanting to take a
snapshot.  If F constructs a new shared snapshot, then doesn't that
overwrite the same memory area C and D were busy copying?  It seems like
F needs some way to know that C and D are done with the old shared
snapshot before it starts writing a new one.  Furthermore, it's hard to
understand how this could be a net improvement in the general case,
because now both B and F are copying everything twice (once to the
shared area and once to backend-local memory) instead of just once (to
backend-local memory) and C and D are sleeping (yikes!).  That could
maybe be a gain at very high concurrency when spinlock contention is
eating us alive, but it doesn't seem like a good idea in general.  Maybe
I'm missing something; did you mean to attach a patch?

I had a different idea for avoiding repeated snapshot computation: save
the snapshot in backend-private memory.  When a process takes
ProcArrayLock in exclusive mode, it increments a 64-bit counter.  When a
process takes ProcArrayLock in shared mode, it can check whether the
counter has changed; if not, it can reuse the snapshot it took before.
I think there might be a way to tinker with the snapshot management so
as to do this without any more copying than we do presently, since
CurrentSnapshotData and SecondarySnapshotData are basically treated as
scratch-pads anyhow.

Another idea would be to try to avoid taking ProcArrayLock at all.  For
example, suppose we again have a 64-bit counter in shared memory.  A
transaction wishing to do ProcArrayEndTransaction() takes ProcArrayLock
in exclusive mode, increments the counter, memory barrier, clears its
XID, memory barrier, increments the counter, releases ProcArrayLock.  A
transaction wishing to take a snapshot first reads the counter.  If the
value read from the counter is even, then we continue like this: memory
barrier, take snapshot, memory barrier, recheck counter.  If we find
that the counter is unchanged, then nobody did ProcArrayEndTransaction()
while we were taking the snapshot, and we're good to go.  If the counter
changed, then we take ProcArrayLock in shared mode and retake the
snapshot.  (We also take ProcArrayLock in shared mode if the initial
value we read was odd.)

This seems like it would all but eliminate contention on the spinlock
protecting ProcArrayLock, but I'm not sure how much overhead it would
add in other cases.  In particular, if we have to retake the snapshot
more than very occasionally, it's not going to be good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
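PS: the second idea is essentially a seqlock.  A toy version, with GCC
atomic builtins standing in for whatever barrier primitives we'd
actually use (sketch only; the formal data race on the memcpy is
inherent to the technique):

#include <stdint.h>
#include <string.h>

#define SNAP_SIZE 64

static volatile uint64_t snapshot_seq;		/* odd = writer mid-update */
static char proc_array_state[SNAP_SIZE];	/* stand-in for the ProcArray */

/* writer side: ProcArrayEndTransaction(), already exclusive-locked */
void
end_transaction_update(void)
{
	__atomic_add_fetch(&snapshot_seq, 1, __ATOMIC_SEQ_CST);	/* now odd */
	proc_array_state[0] = 0;	/* stand-in for clearing our XID */
	__atomic_add_fetch(&snapshot_seq, 1, __ATOMIC_SEQ_CST);	/* even again */
}

/* reader side: returns -1 if the caller should fall back to the lock */
int
try_take_snapshot(char *out)
{
	int			tries;

	for (tries = 0; tries < 3; tries++)
	{
		uint64_t	before = __atomic_load_n(&snapshot_seq, __ATOMIC_SEQ_CST);

		if (before & 1)
			continue;			/* a writer is mid-update; retry */
		memcpy(out, proc_array_state, SNAP_SIZE);
		if (__atomic_load_n(&snapshot_seq, __ATOMIC_SEQ_CST) == before)
			return 0;			/* counter unchanged: copy is consistent */
	}
	return -1;					/* take ProcArrayLock in shared mode */
}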
Re: [HACKERS] plpython SPI cursors
On 11-11-23 01:58 PM, Jan Urbański wrote:
> On 20/11/11 19:14, Steve Singer wrote:
>> On 11-10-15 07:28 PM, Jan Urbański wrote:
>>> Hi,
>>>
>>> attached is a patch implementing the usage of SPI cursors in
>>> PL/Python.  Currently when trying to process a large table in
>>> PL/Python you have to slurp it all into memory (that's what
>>> plpy.execute does).
>>
>> I found a few bugs (see my testing section below) that will need
>> fixing + a few questions about the code.
>
> Responding now to all questions and attaching a revised patch based on
> your comments.

I've looked over the revised version of the patch and it now seems fine.

Ready for committer.

>> Do we like the name plpy.cursor, or would we rather call it something
>> like plpy.execute_cursor(...) or plpy.cursor_open(...) or
>> plpy.create_cursor(...)?  Since we will be mostly stuck with the API
>> once we release 9.2, this is worth some opinions on.  I like cursor()
>> but if anyone disagrees, now is the time.
>
> We use plpy.subtransaction() to create Subxact objects, so I thought
> plpy.cursor() would be most appropriate.

>> This patch does not provide a wrapper around SPI_cursor_move.  The
>> patch is useful without that, and I don't see anything that precludes
>> someone else adding it later if they see a need.
>
> My idea is to add keyword arguments to plpy.cursor() that will allow
> you to decide whether you want a scrollable cursor, and after that
> provide a move() method.

>> The patch includes documentation updates that describe the new
>> feature.  The Database Access page doesn't provide an API-style list
>> of database access functions like the plperl
>> http://www.postgresql.org/docs/9.1/interactive/plperl-builtins.html
>> page does.  I think the organization of the perl page is clearer than
>> the python one, and we should think about doing some documentation
>> refactoring.  That should be done as a separate patch and shouldn't be
>> a barrier to committing this one.
>
> Yeah, the PL/Python docs are a bit chaotic right now.  I haven't yet
> summoned the energy to overhaul them.

>> in PLy_cursor_plan line 4080
>>
>> + PG_TRY();
>> + {
>> +     Portal portal;
>> +     char *volatile nulls;
>> +     volatile int j;
>>
>> I am probably not seeing a code path or misunderstanding something
>> about the setjmp/longjmp usages, but I don't see why nulls and j need
>> to be volatile here.
>
> It looked like you could drop volatile there (and in
> PLy_spi_execute_plan, where this is copied from (did I mention there's
> quite some code duplication in PL/Python?)), but digging in git I found
> this commit:
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2789b7278c11785750dd9d2837856510ffc67000
>
> that added the original volatile qualification, so I guess there's a
> reason.

>> line 444
>> PLy_cursor(PyObject *self, PyObject *args)
>>
>> + {
>> +     char *query;
>> +     PyObject *plan;
>> +     PyObject *planargs = NULL;
>> +
>> +     if (PyArg_ParseTuple(args, "s", &query))
>> +         return PLy_cursor_query(query);
>> +
>>
>> Should query be freed with PyMem_free()?
>
> No, PyArg_ParseTuple returns a string on the stack.  I checked that
> repeatedly creating a cursor with a plan argument does not leak memory,
> and that adding PyMem_Free there promptly leads to a segfault.  I
> tested both python 2.6 and 3 on a Linux system.

>> [test cases demonstrating bugs]
>
> Turns out it's a really bad idea to store pointers to Portal
> structures, because they get invalidated by the subtransaction abort
> hooks.  I switched to storing the cursor name and looking it up in the
> appropriate hash table every time it's used.  The examples you sent
> (which I included as regression tests) now cause a ValueError to be
> raised, with a message stating that the cursor has been created in an
> aborted subtransaction.
>
> Not sure about the wording of the error message, though.
>
> Thanks again for the review!
>
> Cheers,
> Jan
Re: [HACKERS] psql setenv command
On 11/20/2011 11:07 AM, Josh Kupershmidt wrote:
> On Wed, Nov 2, 2011 at 5:36 PM, Andrew Dunstan wrote:
>> Updated patch is attached - adding to Nov commitfest.
>
> Review of the v2 patch:
>
> * Submission Review
> Patch applies with some fuzz and builds without warnings.  I noticed
> some tab characters being used in psql-ref.sgml where they shouldn't
> be.

Fixed.

> * Coding Review / Nitpicks
> The patch implements \setenv via calls to unsetenv() and putenv().  As
> the comment notes,
>
> | That means we
> | leak a bit of memory here, but not enough to worry about.
>
> which seems true; the man page basically says there's nothing to be
> done about the situation.
>
> The calls to putenv() and unsetenv() are done without any real input
> checking.  So this admittedly-pathological case produces surprising
> results:
>
>   test=# \setenv foo=bar baz
>   test=# \! echo $foo
>   bar=baz
>
> Perhaps 'envvar' arguments with a '=' in the argument should just be
> disallowed.

Done that way.

> Also, should the malloc() of newval just use pg_malloc() instead?

Yes, also done.

Revised patch attached.

cheers

andrew

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 01f57c4..39193d5 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2242,6 +2242,24 @@
+        \setenv [ name [ value ] ]
+
+        Sets the environment variable name to value, or if the
+        value is not supplied, unsets the environment variable.
+        Example:
+
+            foo=> \setenv PAGER less
+            foo=> \setenv LESS -imx4F
+
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9cc73be..b5331aa 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1110,6 +1110,56 @@ exec_command(const char *cmd,
 		free(opt0);
 	}
 
+	/* \setenv -- set environment command */
+	else if (strcmp(cmd, "setenv") == 0)
+	{
+		char	   *envvar = psql_scan_slash_option(scan_state,
+													OT_NORMAL, NULL, false);
+		char	   *envval = psql_scan_slash_option(scan_state,
+													OT_NORMAL, NULL, false);
+
+		if (!envvar)
+		{
+			psql_error("\\%s: missing required argument\n", cmd);
+			success = false;
+		}
+		else if (strchr(envvar, '=') != NULL)
+		{
+			psql_error("\\%s: environment variable name must not contain '='\n",
+					   cmd);
+			success = false;
+		}
+		else if (!envval)
+		{
+			/* No argument - unset the environment variable */
+			unsetenv(envvar);
+			success = true;
+		}
+		else
+		{
+			/* Set variable to the value of the next argument */
+			int			len = strlen(envvar) + strlen(envval) + 1;
+			char	   *newval = pg_malloc(len + 1);
+
+			if (!newval)
+			{
+				psql_error("out of memory\n");
+				exit(EXIT_FAILURE);
+			}
+			snprintf(newval, len + 1, "%s=%s", envvar, envval);
+			putenv(newval);
+			success = true;
+
+			/*
+			 * Do not free newval here, it will screw up the environment
+			 * if you do.  See putenv man page for details.  That means we
+			 * leak a bit of memory here, but not enough to worry about.
+			 */
+		}
+		free(envvar);
+		free(envval);
+	}
+
 	/* \sf -- show a function's source code */
 	else if (strcmp(cmd, "sf") == 0 || strcmp(cmd, "sf+") == 0)
 	{
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 4649e94..20fb5d7 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -158,7 +158,7 @@ slashUsage(unsigned short int pager)
 {
 	FILE	   *output;
 
-	output = PageOutput(93, pager);
+	output = PageOutput(94, pager);
 
 	/* if you add/remove a line here, change the row count above */
 
@@ -257,6 +257,7 @@ slashUsage(unsigned short int pager)
 	fprintf(output, _("Operating System\n"));
 	fprintf(output, _("  \\cd [DIR]              change the current working directory\n"));
+	fprintf(output, _("  \\setenv NAME [VALUE]   set environment variable, or unset it if no value provided\n"));
 	fprintf(output, _("  \\timing [on|off]       toggle timing of commands (currently %s)\n"), ON(pset.timing));
 	fprintf(output, _("  \\! [COMMAND]           execute command in shell or start interactive shell\n"));
Re: [HACKERS] [COMMITTERS] pgsql: Modify pg_dump to use error-free memory allocation macros. This
Bruce Momjian writes:
> Bruce Momjian wrote:
>> Tom Lane wrote:
>>> object to arbitrarily moving a bunch of longstanding code from one
>>> file to another.  What that is mainly going to accomplish is creating
>>> a big headache whenever we have to back-patch fixes that touch that
>>> code ... and what exactly did it buy in return?
>>
>> Yes, I didn't like that either.  The problem was that common.c was
>> setup to share code between pg_dump and a long-forgotten tool for
>> Postgres 4.X called pg4_dump (yes, pre-1996).  That code that was
>> moved was really not "common" in any current sense because it was used
>> only by pg_dump (not by pg_restore or pg_dumpall), so I moved it into
>> dumpcatalog.c, and put the really "common" code into common.c.  (We
>> could call it dumpmem.c or something.)
>>
>> Now, one approach would have been to rename common.c to dumpcatalog.c
>> in git, then create a new common.c, but that seems quite confusing to
>> people trying to reconstruct the history.

That will not help.  git is not smart enough to deal with back-patching
changes in code that has moved from one file to another.  If you force
this through, it will mean manual adjustment of every back-patchable fix
that has to touch this code, for the next five or six years.

Yes, the name "common.c" is a bit of a misnomer now, but that doesn't
mean we have to change it, especially since the file's header comment
already explained what it was common for (and could be modified some
more if you don't find that adequate).

>> It is not possible to just link the old common.c into pg_restore and
>> pg_dumpall because it contains calls to pg_dump-only functions.
>>
>> Ideas?
>
> The only other idea I have is to keep most functions in the mis-named
> common.c and move the memory functions into dumpmem.c and dumpmem.h and
> include that in the other C files, but that doesn't help with long-term
> code clarity.

I don't think there is any "long-term code clarity" gain here that is
worth the maintenance headache.  Please put the existing functions back
where they were.  Inventing new files for the new code seems fine.

			regards, tom lane
[HACKERS] Avoiding repeated snapshot computation
In some recent benchmarks and profile data, I saw GetSnapshotData
figures at the very top or near the top.  For smaller numbers of
clients, it can account for 10-20% of time, but for larger numbers of
clients I have seen it take as much as 40% of sample time.
Unfortunately, the machine on which I was running these tests is
currently not available, so I don't have the exact numbers.  But the
observation is essentially correct.  Our recent work on separating the
hot members of PGPROC into a separate array will definitely reduce data
cache misses and reduce the GetSnapshotData time, but it probably still
accounts for a large enough critical section for a highly contended
lock.

I think now that we have reduced the run time of the function itself, we
should now try to reduce the number of times the function is called.
Robert proposed a way to reduce the number of calls per transaction.  I
think we can go one more step further and reduce the number across
transactions.

One major problem could be the way LWLock works.  If the lock is
currently held in SHARED mode by some backend and some other backend now
requests it in SHARED mode, it will immediately get it.  That's probably
the right thing to do, because you don't want a reader to wait when the
lock is readily available.  But in the case of GetSnapshotData(), every
reader is doing exactly the same thing: they are computing a snapshot
based on the same shared state, and would compute exactly the same
snapshot (if we ignore the fact that we don't include the caller's XID
in the xip array, but that's a minor detail).  And because of the way
LWLock works, more and more readers get in to compute the snapshot until
the exclusive waiters get a window to sneak in, as more and more
processes slowly go to sleep waiting for exclusive access.  To depict
it, four transactions make overlapping calls to GetSnapshotData(), and
hence the total critical section starts when the first caller enters it
and ends when the last caller exits:

Txn1 ----[    SHARED    ]---------------------------------
Txn2 ---------[    SHARED    ]----------------------------
Txn3 --------------[    SHARED    ]-----------------------
Txn4 -----------------------[    SHARED    ]--------------
     |<---------------- Total Time ------------->|

A couple of ideas come to mind to solve this issue.

A snapshot once computed will remain valid for every call, irrespective
of its origin, until at least one transaction ends.  So we can store the
last computed snapshot in some shared area and reuse it for all
subsequent GetSnapshotData calls.  The shared snapshot gets invalidated
when some transaction ends by calling ProcArrayEndTransaction().  I
tried this approach and saw a 15% improvement for 32-80 clients on the
32-core HP IA box with pgbench -s 100 -N tests.  Not bad, but I think
this can be improved further.

What we can do is that when a transaction comes to compute its snapshot,
it checks whether some other transaction is already computing a snapshot
for itself.  If so, it just sleeps on the lock.  When the other process
finishes computing the snapshot, it saves the snapshot in a shared area
and wakes up all processes waiting for the snapshot.  All those
processes then just copy the snapshot from the shared area and they are
done.  This will not only reduce the total CPU consumption by avoiding
repetitive work, but will also reduce the total time for which
ProcArrayLock is held in SHARED mode by avoiding a pipeline of
GetSnapshotData calls.  I am currently trying the shared work queue
mechanism to implement this, but I am sure we can do this in some other
way too.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com
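PS: a toy model of the "one computes, everyone else waits and copies"
scheme, with a pthread mutex/condition variable standing in for the
shared work queue (sketch only, not the real ProcArray code; a
transaction ending would set valid = false under the mutex):

#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define SNAP_SIZE 64

static pthread_mutex_t snap_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t snap_done = PTHREAD_COND_INITIALIZER;
static bool computing = false;	/* someone is building the snapshot */
static bool valid = false;		/* shared copy is usable */
static char shared_snap[SNAP_SIZE];

/* stand-in for the expensive ProcArray scan */
static void
compute_snapshot(char *dst)
{
	memset(dst, 0, SNAP_SIZE);
}

void
get_snapshot(char *out)
{
	pthread_mutex_lock(&snap_mutex);
	while (!valid && computing)
		pthread_cond_wait(&snap_done, &snap_mutex);	/* piggy-back */
	if (!valid)
	{
		computing = true;
		pthread_mutex_unlock(&snap_mutex);
		compute_snapshot(shared_snap);	/* the expensive part, done once */
		pthread_mutex_lock(&snap_mutex);
		computing = false;
		valid = true;
		pthread_cond_broadcast(&snap_done);	/* wake all waiters */
	}
	memcpy(out, shared_snap, SNAP_SIZE);	/* everyone just copies */
	pthread_mutex_unlock(&snap_mutex);
}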
Re: [HACKERS] vpath builds and verbose error messages
Peter Eisentraut writes:
> On fre, 2011-11-18 at 09:44 -0500, Tom Lane wrote:
>> It wouldn't be that hard for elog.c to do strrchr(fname, '/') or
>> something like that,

> Here is a patch for that.  I would also like to backpatch this.

Hmmm ... is it possible that strrchr could change errno?  If so we'd
need to re-order the operations a bit.  Otherwise this seems fine.

			regards, tom lane
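PS: if it can, the cheap answer is to capture errno first, e.g. (an
untested sketch against elog_start(); saved_errno is the existing
ErrorData field):

	/* capture errno before any library call that might clobber it */
	int			save_errno = errno;
	const char *slash = filename ? strrchr(filename, '/') : NULL;

	edata->filename = slash ? slash + 1 : filename;
	edata->saved_errno = save_errno;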
Re: [HACKERS] proposal : backend startup hook / after logon trigger
Dne 25.11.2011 17:48, Tom Lane napsal(a):
> "Tomas Vondra" writes:
>> On 25 Listopad 2011, 2:44, Robert Haas wrote:
>>> I've thought of this before, but I'm not exactly clear on what the
>>> use cases are.
>
>> The startup hook is useful for initializing an extension written in C,
>> when the extension was loaded from postgresql.conf.  If you need to
>> perform the initialization for each db separately (so that you can
>> decide whether to apply the extension to the user/database), you need
>> to do that after the backend starts.
>
> If you need that, just load the extension with local_preload_libraries.

I can't do that.  I have an extension that needs to be loaded from
shared_preload_libraries (because it needs to ask for space in the
shared memory segment), but I need to perform some additional
initialization for each backend.

Right now I solve that with an 'initialized' variable that's set to
false, and I have to check it before each action (and perform the init
if it's false).  An after-startup hook would be much cleaner.

Tomas
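PS: for context, the workaround looks like this today (a sketch of the
pattern, not my actual extension; the names are illustrative):

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

static bool initialized = false;

/*
 * Every entry point has to pay for this check, because _PG_init() runs
 * in the postmaster when the library sits in shared_preload_libraries,
 * before any per-backend state (database, user) exists.
 */
static void
ensure_backend_init(void)
{
	if (initialized)
		return;
	/* ... per-backend setup that needs MyDatabaseId etc. ... */
	initialized = true;
}

PG_FUNCTION_INFO_V1(my_extension_func);

Datum
my_extension_func(PG_FUNCTION_ARGS)
{
	ensure_backend_init();
	/* ... the real work ... */
	PG_RETURN_VOID();
}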
Re: [HACKERS] psql \ir filename normalization
Bruce Momjian wrote:
> Robert Haas wrote:
>> On Mon, Nov 21, 2011 at 2:30 PM, Bruce Momjian wrote:
>>> Robert Haas wrote:
>>>> On Mon, Nov 21, 2011 at 1:05 PM, Bruce Momjian wrote:
>>>>> Robert Haas wrote:
>>>>>> Argh.  The root of the problem here seems to be that
>>>>>> join_path_components() feels entitled to arbitrarily insert a
>>>>>> pathname separator at the front of the output string even if its
>>>>>> first input didn't begin with one originally.  Lame!
>>>>>
>>>>> The attached patch fixes this report, I think.
>>>>
>>>> Looks sensible.  Keep in mind we need to back-patch this.
>>>
>>> Oh.  Well, with no bug reports about it, does that make sense?  Do we
>>> have any code that relies on the old behavior?
>>
>> Oh, wait a minute.  I was thinking \ir was in 9.1, but it's not: it
>> was committed after the branch.  So I guess this only needs to be
>> fixed in master, which is much less scary.
>
> Agreed.  I realize it is wrong, but I have no idea what impact fixing
> it in back branches might have, or what it might do to people who are
> relying on the broken behavior in some way.

Patch applied.

--
Bruce Momjian        http://momjian.us
EnterpriseDB         http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] vpath builds and verbose error messages
On fre, 2011-11-18 at 09:44 -0500, Tom Lane wrote:
> It wouldn't be that hard for elog.c to do strrchr(fname, '/') or
> something like that,

Here is a patch for that.  I would also like to backpatch this.

diff --git i/src/backend/utils/error/elog.c w/src/backend/utils/error/elog.c
index 9a99fc7..0131c06 100644
--- i/src/backend/utils/error/elog.c
+++ w/src/backend/utils/error/elog.c
@@ -343,7 +343,14 @@ errstart(int elevel, const char *filename, int lineno,
 	edata->elevel = elevel;
 	edata->output_to_server = output_to_server;
 	edata->output_to_client = output_to_client;
-	edata->filename = filename;
+	if (filename)
+	{
+		const char *slash;
+
+		/* keep only base name, useful especially for vpath builds */
+		slash = strrchr(filename, '/');
+		edata->filename = slash ? slash + 1 : filename;
+	}
 	edata->lineno = lineno;
 	edata->funcname = funcname;
 	/* the default text domain is the backend's */
@@ -1142,7 +1149,14 @@ elog_start(const char *filename, int lineno, const char *funcname)
 	}
 
 	edata = &errordata[errordata_stack_depth];
-	edata->filename = filename;
+	if (filename)
+	{
+		const char *slash;
+
+		/* keep only base name, useful especially for vpath builds */
+		slash = strrchr(filename, '/');
+		edata->filename = slash ? slash + 1 : filename;
+	}
 	edata->lineno = lineno;
 	edata->funcname = funcname;
 	/* errno is saved now so that error parameter eval can't change it */
Re: [HACKERS] Inserting heap tuples in bulk in COPY
On 25.11.2011 23:32, Jeff Janes wrote:
> On Fri, Nov 25, 2011 at 12:53 PM, Jeff Janes wrote:
>> Thanks for this patch.  Doing bulk copies in parallel for me is now
>> limited by the IO subsystem rather than the CPU.
>>
>> This patch, commit number d326d9e8ea1d69, causes fillfactor to be
>> ignored for the copy command.  Is this acceptable collateral damage?
>
> Having looked into it a little bit, I think this might be an acceptable
> fix.

Thanks, applied.  It was an oversight.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com