Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
> > Linas, could you capture the output of pg_controldata *and* increase the
> > log level to DEBUG1 on the standby? We should then see the nextXid value
> > of the checkpoint the recovery is starting from.
>
> I'll try to do that whenever I'm in that territory again... Incidentally,
> recently there was a lot of unrelated-to-this-post work to polish things up
> for a talk being given at PGWest 2011 Today :)
>
> > I also checked what rsync does when a file vanishes after rsync computed
> > the file list, but before it is sent. rsync 3.0.7 on OSX, at least,
> > complains loudly, and doesn't sync the file. It BTW also exits non-zero,
> > with a special exit code for precisely that failure case.
>
> To be precise, my script has logic to accept the exit code 24, just as
> stated in the PG manual:
>
> Docs> For example, some versions of rsync return a separate exit code for
> Docs> "vanished source files", and you can write a driver script to accept
> Docs> this exit code as a non-error case.

I am also running into this issue and can reproduce it very reliably. For me, however, it happens even when doing the "fast backup", like so: pg_start_backup('whatever', true). My traffic is more write-heavy than Linas's, though, so that might have something to do with it.

Yesterday it reliably errored out on pg_clog every time, but today it is failing sporadically on pg_subtrans (which seems to be past where the pg_clog error was). The only thing that has changed is that I've raised the log level to debug1; I wouldn't think that could be related, though.

I've linked the requested pg_controldata and debug1 logs for both errors. Both links contain the output from pg_start_backup, rsync, pg_stop_backup, pg_controldata, and then the postgres debug1 log produced from a subsequent startup attempt.
pg_clog: http://pastebin.com/mTfdcjwH
pg_subtrans: http://pastebin.com/qAXEHAQt

Any workarounds would be very much appreciated. Would copying clog+subtrans before or after the rest of the data directory (or something like that) make any difference? Thanks!
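The driver-script logic the poster describes -- treating rsync's exit code 24 ("vanished source files") as a non-error case, per the PG manual -- can be sketched in Python. This is an illustrative sketch only; the rsync options and paths are placeholders, not the poster's actual script:

```python
import subprocess

# Exit code 24 means "partial transfer due to vanished source files",
# which is expected when backing up a live data directory.
ACCEPTABLE_EXIT_CODES = {0, 24}

def exit_ok(returncode):
    """Accept success and the 'vanished source files' warning."""
    return returncode in ACCEPTABLE_EXIT_CODES

def run_backup_rsync(src, dst):
    # Hypothetical invocation; a real backup script would add more options.
    result = subprocess.run(["rsync", "-a", src, dst])
    if not exit_ok(result.returncode):
        raise RuntimeError("rsync failed with exit code %d" % result.returncode)
```

Any other non-zero exit code still aborts the backup, so genuine transfer failures are not silently ignored.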
[HACKERS] plpython SPI cursors
Hi,

attached is a patch implementing the usage of SPI cursors in PL/Python. Currently when trying to process a large table in PL/Python you have to slurp it all into memory (that's what plpy.execute does). This patch allows reading the result set in smaller chunks, using an SPI cursor behind the scenes. Example usage:

cursor = plpy.cursor("select a, b from hugetable")
for row in cursor:
    plpy.info("a is %s and b is %s" % (row['a'], row['b']))

The patch itself is simple, but there's a lot of boilerplate dedicated to opening a subtransaction and handling prepared plans. I'd like to do some refactoring of the way PL/Python uses SPI to reduce the amount of boilerplate needed, but that'll come as a separate patch (just before the patch to split plpython.c into smaller chunks).

This feature has been sponsored by Nomao.

Cheers, Jan

PS: I already added it to the November CF.

>From 9ad14957e7b4ae19667df3bb8cc2aa5ef5bf96c8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jan=20Urba=C5=84ski?=
Date: Tue, 13 Sep 2011 14:42:41 +0200
Subject: [PATCH] Add cursor support to plpythonu.

Exposes SPI cursors as plpythonu objects, allowing processing of large result sets without loading them entirely into memory, as plpy.execute does.
---
 doc/src/sgml/plpython.sgml                 |  80
 src/pl/plpython/expected/plpython_spi.out  | 151 +++
 src/pl/plpython/expected/plpython_test.out |   6 +-
 src/pl/plpython/plpython.c                 | 605
 src/pl/plpython/sql/plpython_spi.sql       | 116 ++
 5 files changed, 955 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
index eda2bbf..d08c3d1 100644
*** a/doc/src/sgml/plpython.sgml
--- b/doc/src/sgml/plpython.sgml
*** $$ LANGUAGE plpythonu;
*** 892,897
--- 892,906

Note that calling plpy.execute will cause the entire result set to be read into memory. Only use that function when you are sure that the result set will be relatively small.
If you don't want to risk excessive memory usage when fetching large results, use plpy.cursor rather than plpy.execute.

For example:

rv = plpy.execute("SELECT * FROM my_table", 5)

*** $$ LANGUAGE plpythonu;
*** 958,963
--- 967,1043

Accessing data with cursors

The plpy.cursor function accepts the same arguments as plpy.execute (except for limit) and returns a cursor object, which allows you to process large result sets in smaller chunks. As with plpy.execute, either a query string or a plan object along with a list of arguments can be used. The cursor object provides a fetch method that accepts an integer parameter and returns a result object. Each time you call fetch, the returned object will contain the next batch of rows, never larger than the parameter value. Once all rows are exhausted, fetch starts returning an empty result object. Cursor objects also provide an iterator interface (http://docs.python.org/library/stdtypes.html#iterator-types), yielding one row at a time until all rows are exhausted. Data fetched that way is not returned as result objects, but rather as dictionaries, each dictionary corresponding to a single result row.

Cursors are automatically disposed of, but if you want to explicitly release all resources held by a cursor, use the close method. Once closed, a cursor cannot be fetched from anymore.

Do not confuse objects created by plpy.cursor with DBAPI cursors as defined by the Python Database API specification (http://www.python.org/dev/peps/pep-0249/). They don't have anything in common except for the name.
An example of two ways of processing data from a large table would be:

CREATE FUNCTION count_odd_iterator() RETURNS integer AS $$
odd = 0
for row in plpy.cursor("select num from largetable"):
    if row['num'] % 2:
        odd += 1
return odd
$$ LANGUAGE plpythonu;

CREATE FUNCTION count_odd_fetch(batch_size integer) RETURNS integer AS $$
odd = 0
cursor = plpy.cursor("select num from largetable")
while True:
    rows = cursor.fetch(batch_size)
    if not rows:
        break
    for row in rows:
        if row['num'] % 2:
            odd += 1
return odd
$$ LANGUAGE plpythonu;

CREATE FUNCTION count_odd_prepared() RETURNS integer AS $$
odd = 0
plan = plpy.prepare("select num from largetable where num % $1 != 0", ["integer"])
rows = list(plpy.cursor(plan, [2]))
return len(rows)
$$ LANGUAGE plpythonu;

Trapping Errors

diff --git a/src/pl/plpython/expected/plpython_spi.out b/src/pl/plpython/expected/plpython_spi.out
index 7f4ae5c..3b4d7a3 100644
*** a/src/pl/plpython/expected/plpython_spi.out --
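The fetch semantics described in the documentation hunk above -- each fetch call returns at most batch_size rows, then empty results once the cursor is exhausted -- can be sketched in plain Python, independent of PL/Python itself (the function name here is illustrative, not part of the patch):

```python
from itertools import islice

def fetch_in_batches(rows, batch_size):
    """Yield successive batches of at most batch_size rows, mimicking
    the behaviour of repeated cursor.fetch(batch_size) calls."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # exhausted: a real cursor would keep returning empty results
        yield batch
```

For example, `list(fetch_in_batches(range(5), 2))` yields `[[0, 1], [2, 3], [4]]`: full batches first, then a final short batch, then nothing.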
Re: [HACKERS] COUNT(*) and index-only scans
On Mon, Oct 10, 2011 at 9:48 PM, Kevin Grittner wrote:
> Jeff Janes wrote:
>> Kevin Grittner wrote:
>>
>>> create table t (id int not null primary key);
>>> insert into t select generate_series(1, 100);
>>> vacuum freeze analyze;
>>> explain analyze select count(*) from t
>>> where id between 50 and 500010;
>>>
>>> That gives you an index-only scan; but without the WHERE clause it
>>> uses a seq scan.
>>
>> If you convert the where clause to "where id is not null" it uses
>> the index only scan again, but only if you nudge it too with
>> enable_seqscan=off.

With a recent commit from (I assume) Tom, the "where id is not null" is no longer needed.

> Clever way to get a full-table test.
>
> It turns out that for the above, with your trick to use the index
> only scan, it comes out 12% faster to do a seqscan, even when the
> table and index are fully cached (based on the average time of ten
> runs each way). There's very little overlap, so the difference looks
> real. But that's on a very narrow record, having just the one column
> used in the index. I added one wide column like this:
>
> alter table t add column x text;
> update t set x = (repeat(random()::text, (random() * 100)::int));
> cluster t USING t_pkey;
> vacuum freeze analyze;
>
> With that change the index-only scan time remained unchanged, while
> the seqscan time grew to about 2.6 times the index only scan time.
> That was mildly surprising for me, considering it was all still
> cached.

I used the pgbench_accounts table from pgbench -i -s 50, where all data fits in shared_buffers, using the -f switch with either

set enable_seqscan=off;
select count(*) from pgbench_accounts;

or

set enable_indexonlyscan=off;
select count(*) from pgbench_accounts;

With just a single client, it was a toss-up. But with 8 concurrent clients on an 8 CPU machine, the index-only scan was 50% faster.
So that is a nice win, even if well-designed apps probably shouldn't be endlessly counting rows of an unchanging table using all available CPUs in the first place. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
On Sat, Oct 15, 2011 at 02:58:45PM -0400, Tom Lane wrote:
> [algorithm for a regular index scan satisfying "key IN (...)"]
>
> So, at least as far as btrees are concerned, it seems like I implemented
> the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed
> down into the index AM. The above rules seem pretty btree-specific, so
> I don't think that we ought to have the main executor doing any of this.
> I envision doing this by marking btree as supporting ScalarArrayOpExpr
> scankeys directly, so that the executor does nothing special with them,
> and the planner treats them the same as regular scalar indexquals.

Sounds sensible. The algorithm applies to more than ScalarArrayOpExpr; isn't it really the ability to handle an OR'ed list of ScanKeys instead of an AND'ed one? Would it be worth exposing the capability along those lines instead, even if the only initial consumer is ScalarArrayOpExpr?

> In principle somebody could be doing something like
> WHERE pointcol <@ ANY (ARRAY[list of box values])
> and expecting that to generate a bitmap indexscan on a GIST index, but
> is it likely that anyone is doing that? (As opposed to the equivalent
> formulation with "pointcol <@ box1 OR pointcol <@ box2 ...", which would
> still work to generate OR'd bitmap scans even if we took out the
> ScalarArrayOpExpr logic.)

Removing the "key IN (...)" optimization for hash indexes would add one more barrier to making that access method practical.
Re: [HACKERS] index-only scans
On Sat, Oct 15, 2011 at 2:16 PM, Jeff Janes wrote:
> On Fri, Oct 7, 2011 at 11:40 AM, Tom Lane wrote:
>> Robert Haas writes:
>>> Please find attached a patch implementing a basic version of
>>> index-only scans.
>>
>> I'm making some progress with this, but I notice what seems like a
>> missing feature: there needs to be a way to turn it off. Otherwise
>> performance comparisons will be difficult to impossible.
>>
>> The most obvious solution is a planner control GUC, perhaps
>> "enable_indexonlyscan". Anyone object, or want to bikeshed the name?
>
> Currently I can't force an indexonlyscan over an indexscan, because of
> the way the enable_* variables work.

OK, scratch that. I must have been using the wrong query (one for which the index was not covering), as I can't reproduce the behavior, nor, looking at the code, can I see how it could have occurred in the first place.

Cheers, Jeff
Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
On Sat, Oct 15, 2011 at 2:58 PM, Tom Lane wrote:
> In principle somebody could be doing something like
> WHERE pointcol <@ ANY (ARRAY[list of box values])
> and expecting that to generate a bitmap indexscan on a GIST index, but
> is it likely that anyone is doing that? (As opposed to the equivalent
> formulation with "pointcol <@ box1 OR pointcol <@ box2 ...", which would
> still work to generate OR'd bitmap scans even if we took out the
> ScalarArrayOpExpr logic.)

That seems like a pretty natural formulation to me, so I would be rather reluctant to assume nobody's doing it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] index-only scans
On Fri, Oct 7, 2011 at 11:40 AM, Tom Lane wrote:
> Robert Haas writes:
>> Please find attached a patch implementing a basic version of
>> index-only scans.
>
> I'm making some progress with this, but I notice what seems like a
> missing feature: there needs to be a way to turn it off. Otherwise
> performance comparisons will be difficult to impossible.
>
> The most obvious solution is a planner control GUC, perhaps
> "enable_indexonlyscan". Anyone object, or want to bikeshed the name?

Currently I can't force an indexonlyscan over an indexscan, because of the way the enable_* variables work. Should setting enable_indexscan=off also disable index-only scans (the current behavior) in addition to plain index scans? By way of precedent, enable_indexscan=off does not disable Bitmap Index Scan.

Cheers, Jeff
Re: [HACKERS] Call stacks and RAISE INFO
> No, I don't think so. The use-case for this sort of thing seems to me
> to be messages that are directed to the user or DBA, and don't want to
> be decorated with a lot of information about where they came from.
> That determination is usually pretty clear when you write the code.

For my case, I agree with Tom. For example, in my recent debugging session, I was debugging a recursive function ... one which calls itself, up to 6 levels deep. For that function, I want to turn context off because there's so much of it that it becomes unreadable, and instead I put a nesting counter in the INFO. I don't want to turn off context for other functions universally -- even in the same ETL session -- because I want to know what called them, since some of them can be called on multiple paths.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
[HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
Almost immediately after we committed index-only scans, there were complaints that it didn't work with "indexkey IN (...)" conditions, that is ScalarArrayOpExpr index quals. That's because the current code only supports ScalarArrayOpExpr as a bitmap indexscan qual, not a regular indexscan. The executor performs a separate indexscan for each element of the array, relying on the bitmap mechanism to eliminate duplicate hits on the same heap tuple.

In principle it's not hard to see how ScalarArrayOpExpr could be supported as a regular indexqual for btree indexes. If the comparison operator has <, <=, >=, or > semantics, then the array condition is equivalent to a simple scalar condition using the greatest or least array element respectively. Otherwise (i.e., the operator is =):

1. Sort the array elements into the same order as the btree, eliminating duplicates and NULL entries. (If there are no non-null elements, the qual condition is unsatisfiable and we can skip the indexscan.)

2. Perform an indexscan for each remaining array element, in sequence. Since we eliminated equal values in step 1, there can be no two matches to the same index entry, so there's no need for duplicate elimination. What's more, because we sorted the array to match the index order, index entries will be matched in index order, so the results of the scan can still be expected to come out in sorted order, a critical expectation for btree indexscans.

If we have more than one ScalarArrayOpExpr then we have to be a bit careful about how we perform the multiple indexscans to ensure that output ordering is preserved, but it's certainly doable.

So, at least as far as btrees are concerned, it seems like I implemented the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed down into the index AM. The above rules seem pretty btree-specific, so I don't think that we ought to have the main executor doing any of this.
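Step 1 of the proposed btree strategy (for the = case) amounts to the following, sketched here in Python rather than the C an index AM would actually use; the function name is illustrative only:

```python
def prepare_array_keys(values):
    """Sort array elements into index order, eliminating duplicates and
    NULL entries (represented here as None), per step 1 above."""
    keys = sorted({v for v in values if v is not None})
    # An empty result means the qual condition is unsatisfiable and
    # the indexscan can be skipped entirely.
    return keys
```

Because the surviving keys are distinct and sorted, scanning them in sequence (step 2) can never match the same index entry twice, and the matches come out in index order.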
I envision doing this by marking btree as supporting ScalarArrayOpExpr scankeys directly, so that the executor does nothing special with them, and the planner treats them the same as regular scalar indexquals.

So that leaves me wondering whether the existing executor-level implementation of ScalarArrayOpExpr indexquals ought to be ripped out entirely. It would not save all that much code in the executor proper (basically ExecIndexEvalArrayKeys, ExecIndexAdvanceArrayKeys, and a few lines of supporting logic). However, there's a fair amount of cruft in the planner to deal with the concept that ScalarArrayOpExpr is supported as a bitmap indexscan qual but not a plain indexscan qual. If that code is only going to get exercised for non-btree index types, it's likely to be under-tested and hence a continuing source of bugs.

In principle somebody could be doing something like

WHERE pointcol <@ ANY (ARRAY[list of box values])

and expecting that to generate a bitmap indexscan on a GIST index, but is it likely that anyone is doing that? (As opposed to the equivalent formulation with "pointcol <@ box1 OR pointcol <@ box2 ...", which would still work to generate OR'd bitmap scans even if we took out the ScalarArrayOpExpr logic.)

Thoughts?

regards, tom lane
Re: [HACKERS] Call stacks and RAISE INFO
2011/10/15 Tom Lane:
> Pavel Stehule writes:
>> 2011/10/15 Robert Haas:
>>> On Sat, Oct 15, 2011 at 12:24 AM, Pavel Stehule wrote:
>>>> I have no problem with this. A context can be false for info and true
>>>> for other in default. Please, use a different identifier than
>>>> "context", that can be used for reading context in future - maybe
>>>> "attach_context" or some similar.
>>>
>>> error_context?
>>
>> what about show_context, hide_context, hold_context, use_context ??
>
> I still think it should be CONTEXT, period. All the other options to
> RAISE are named directly after the message lines they control; why
> should this one be different?

I had an idea to set CONTEXT from the RAISE statement - for forwarding data from a handled exception, something like:

BEGIN
  ...
EXCEPTION WHEN ... THEN
  GET DIAGNOSTICS _context = PG_EXCEPTION_CONTEXT;
  RAISE USING context = _context;
END;

Regards
Pavel Stehule
Re: .psqlrc version dependence (was Re: [HACKERS] about EDITOR_LINENUMBER_SWITCH)
Tom Lane wrote:
> Peter Eisentraut writes:
>> Contrary to what the subject suggests, I think the main reason people
>> wanted this feature was to be able to set the linestyle to unicode
>> without getting a warning from older releases about unknown linestyle or
>> something. But in a few years, they'll have to
>> maintain .psqlrc-9.2, .psqlrc-9.3, .psqlrc-9.4, etc. That doesn't sound
>> like a useful long-term solution either.
>
> Well, "in a few years" they won't need that conditionality any more at
> all, so I'm not sure I believe the above argument. The problem seems
> inherently self-limiting.
>
> What struck me while looking at the patch is that it is conditional
> based on *psql's* version. Not the version of the server you're
> connected to. I'm not too sure what use-cases people have for version
> dependence here, but I'd think that the server version would enter into
> it sometimes.

The assumption is that the .psqlrc is controlling psql behavior. I'm not sure what setting would be changed based on server version -- maybe psql variables. I have updated the docs to indicate it is the psql version.

> (Of course, for server version to be used usefully, you'd need to
> re-execute the rc file during \c, something we don't do now.)

Yep, yuck.

-- 
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
.psqlrc version dependence (was Re: [HACKERS] about EDITOR_LINENUMBER_SWITCH)
Peter Eisentraut writes:
> Contrary to what the subject suggests, I think the main reason people
> wanted this feature was to be able to set the linestyle to unicode
> without getting a warning from older releases about unknown linestyle or
> something. But in a few years, they'll have to
> maintain .psqlrc-9.2, .psqlrc-9.3, .psqlrc-9.4, etc. That doesn't sound
> like a useful long-term solution either.

Well, "in a few years" they won't need that conditionality any more at all, so I'm not sure I believe the above argument. The problem seems inherently self-limiting.

What struck me while looking at the patch is that it is conditional based on *psql's* version. Not the version of the server you're connected to. I'm not too sure what use-cases people have for version dependence here, but I'd think that the server version would enter into it sometimes. (Of course, for server version to be used usefully, you'd need to re-execute the rc file during \c, something we don't do now.)

regards, tom lane
Re: [HACKERS] pg_dump vs malloc
On Fri, Oct 14, 2011 at 21:11, Bruce Momjian wrote:
> Magnus Hagander wrote:
>> On Wed, Jun 22, 2011 at 17:48, Tom Lane wrote:
>>> Magnus Hagander writes:
>>>> Something along the line of this?
>>>
>>> I think this is a seriously, seriously bad idea:
>>>
>>>> +#define strdup(x) pg_strdup(x)
>>>> +#define malloc(x) pg_malloc(x)
>>>> +#define calloc(x,y) pg_calloc(x, y)
>>>> +#define realloc(x,y) pg_realloc(x, y)
>>>
>>> as it will render the code unreadable to people expecting the normal
>>> behavior of these fundamental functions; not to mention break any
>>> call sites that have some other means of dealing with an alloc failure
>>> besides going belly-up. Please take the trouble to do
>>> s/malloc/pg_malloc/g and so on, instead.
>>
>> Ok, I'll try that approach. This seemed like a "nicer" approach, but I
>> think once written out, I agree with your arguments :-)
>
> Where are we on this?

It's still sitting on my personal TODO list, just not with a really high priority.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] WIP: collect frequency statistics for arrays
Hi!

Thanks for your attention to my patch!

On Sat, Oct 15, 2011 at 2:47 PM, Nathan Boley wrote:
>> Looking now, I see that Alexander wasn't Cc'd on the review, so it's
>> possible he missed the message?
>
> We've corresponded off list and have discussed my review at some length.
> Alex submitted an updated patch on Sep 22 to me personally (although
> not to the list? Alex?), with the promise of a new version with
> improved comments.

Oh, I didn't notice that I had posted the updated patch off-list. So, here is a repost of that version of the patch. The list of changes is below:

1) A distinct slot is used for the length histogram.
2) Standard statistics are collected for arrays.
3) Most common values and most common elements are mapped to distinct columns of the pg_stats view, because both of them are calculated for arrays.

Right now it's hard for me to give this as much time as I would like to, but I hope to present improved comments and testing by the end of October.

-- 
With best regards,
Alexander Korotkov.

arrayanalyze-0.5.patch.gz
Description: GNU Zip compressed data
Re: [HACKERS] Call stacks and RAISE INFO
Pavel Stehule writes:
> 2011/10/15 Robert Haas:
>> On Sat, Oct 15, 2011 at 12:24 AM, Pavel Stehule wrote:
>>> I have no problem with this. A context can be false for info and true
>>> for other in default. Please, use a different identifier than
>>> "context", that can be used for reading context in future - maybe
>>> "attach_context" or some similar.
>>
>> error_context?
>
> what about show_context, hide_context, hold_context, use_context ??

I still think it should be CONTEXT, period. All the other options to RAISE are named directly after the message lines they control; why should this one be different?

regards, tom lane
Re: [HACKERS] about EDITOR_LINENUMBER_SWITCH
Andrew Dunstan wrote:
> On 10/15/2011 09:37 AM, Peter Eisentraut wrote:
>> On tor, 2011-10-13 at 11:31 -0400, Bruce Momjian wrote:
>>> The attached patch changes this to use the _major_ version number for
>>> psql rc files. Does this have to be backward-compatible? Should I
>>> check for minor and major matches? That is going to be confusing to
>>> document.
>>
>> Contrary to what the subject suggests, I think the main reason people
>> wanted this feature was to be able to set the linestyle to unicode
>> without getting a warning from older releases about unknown linestyle or
>> something. But in a few years, they'll have to
>> maintain .psqlrc-9.2, .psqlrc-9.3, .psqlrc-9.4, etc. That doesn't sound
>> like a useful long-term solution either.
>
> Wouldn't it be better to support some conditional syntax?

I suppose if we add that to psql we can remove this facility completely.

-- 
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] about EDITOR_LINENUMBER_SWITCH
Peter Eisentraut wrote:
> On tor, 2011-10-13 at 11:31 -0400, Bruce Momjian wrote:
>> The attached patch changes this to use the _major_ version number for
>> psql rc files. Does this have to be backward-compatible? Should I
>> check for minor and major matches? That is going to be confusing to
>> document.
>
> Contrary to what the subject suggests, I think the main reason people
> wanted this feature was to be able to set the linestyle to unicode
> without getting a warning from older releases about unknown linestyle or
> something. But in a few years, they'll have to
> maintain .psqlrc-9.2, .psqlrc-9.3, .psqlrc-9.4, etc. That doesn't sound
> like a useful long-term solution either.

Well, frankly, I think the fact that we were matching on the minor version number was even worse; this is slightly better. I guess they could use symlinks to keep a config file for multiple versions, but I agree it isn't a great long-term solution.

-- 
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] about EDITOR_LINENUMBER_SWITCH
On 10/15/2011 09:37 AM, Peter Eisentraut wrote:
> On tor, 2011-10-13 at 11:31 -0400, Bruce Momjian wrote:
>> The attached patch changes this to use the _major_ version number for
>> psql rc files. Does this have to be backward-compatible? Should I
>> check for minor and major matches? That is going to be confusing to
>> document.
>
> Contrary to what the subject suggests, I think the main reason people
> wanted this feature was to be able to set the linestyle to unicode
> without getting a warning from older releases about unknown linestyle or
> something. But in a few years, they'll have to
> maintain .psqlrc-9.2, .psqlrc-9.3, .psqlrc-9.4, etc. That doesn't sound
> like a useful long-term solution either.

Wouldn't it be better to support some conditional syntax?

cheers

andrew
Re: [HACKERS] about EDITOR_LINENUMBER_SWITCH
On tor, 2011-10-13 at 11:31 -0400, Bruce Momjian wrote:
> The attached patch changes this to use the _major_ version number for
> psql rc files. Does this have to be backward-compatible? Should I
> check for minor and major matches? That is going to be confusing to
> document.

Contrary to what the subject suggests, I think the main reason people wanted this feature was to be able to set the linestyle to unicode without getting a warning from older releases about unknown linestyle or something. But in a few years, they'll have to maintain .psqlrc-9.2, .psqlrc-9.3, .psqlrc-9.4, etc. That doesn't sound like a useful long-term solution either.
Re: [HACKERS] LIMITing number of results in a VIEW with global variables
Hello Florian,

> It seems dangerous for a cast to modify global state such as MU. The
> evaluation order of functions during query execution isn't always easy to
> guess, and may change depending on the execution plan.

I supposed that fuzzy2bool is called just before the terminal evaluation of the WHERE clause, which needs a Boolean. My first tests showed that this hypothesis is right, but it might be wrong in the case of alternative execution plans.

>> With this implicit cast, the query
>> SELECT age, young(age) FROM set_alpha(0.1), employees WHERE young(age);
>> is equivalent to
>> SELECT age, young(age) FROM set_alpha(0.1), employees WHERE
>> fuzzy2bool(young(age));
>
> Those set_alpha() calls seem equally dangerous. If this alpha is supposed
> to be a global parameter, why not set it *before* issuing the query?

Alternatively, we could also set the alpha value before the query:

SELECT set_alpha(0.1);
SELECT age, young(age) FROM employees WHERE young(age);

I would be very interested to know if there is a smarter way to set global variables.

>> I can sort the results in the view 'sorted_employees' according to
>> the value MU of a fuzzy predicate thanks to the fuzzy2bool cast function.
>>
>> CREATE OR REPLACE VIEW sorted_employees AS
>>   SELECT *, get_mu() as mu
>>   FROM employees
>>   ORDER BY mu DESC;
>
> Are you aware that an ORDER BY clause in a VIEW is only going to work
> if you do "SELECT ... FROM view"?

It's really the first time I am using views -- I didn't know that!

> If the outer query is more complex
> than that, I wouldn't bet on the results coming back in the expected order.

I don't mind if the ordering is wrong: it is just a way to process filtering according to K and ALPHA.

> Usually, you'd attach ORDER BY to the outermost query (or to subqueries
> containing a LIMIT clause).
>> The following query
>> SELECT age, mu FROM set_k(5), set_alpha(0.1), filtered_employees
>
> Again, you seem to rely on these set_k(), set_alpha() calls being
> done before the query is executed. But postgres doesn't guarantee
> that.
>
>> WHERE young(age);
>> gives the results:
>>
>> age  mu
>> 24   1
>> 16   1
>>
>> instead of:
>>
>> age  mu
>> 16   1
>> 21   0.89976158142
>> 24   0.60023841858
>> 26   0.40005960464
>> 26   0.40005960464
>>
>> It seems that the 'LIMIT K' instruction has side effects on the MU value.
>
> The execution plan may very well vary depending on the limit. Off-hand,
> I'd guess that with a small K, one of these set_whatever() calls in one
> of your FROM clauses gets executed after the computation it's supposed
> to affect has already happened.
>
>> Why is it not working? How to fix this issue?
>
> Don't rely on the execution order of function calls in a SELECT statement.
> Divide your functions into two classes.
>
> The ones which have side effects (i.e. change global state). These should
> *never* be called from SQL statements, except in the trivial case of
> "SELECT my_func(...);". Also, they should be marked VOLATILE.
>
> And the ones without side effects. Those should be marked IMMUTABLE,
> STABLE or VOLATILE, depending on how they're influenced by global state
> changes. Read the documentation on these flags.
>
> Finally, don't assume that ORDER BY inside a view influences the output
> order of queries using the view. (Except in the trivial case of
> "SELECT * FROM view".) As a rule of thumb, ORDER BY in a view only makes
> sense if there's also a LIMIT clause. You are then guaranteed that the
> view returns the first rows according to the specified order.
> Don't assume they're necessarily returned in ascending order, though.
>
> best regards,
> Florian Pflug

Thank you very much! I will try to follow your advice.

Thomas
Re: [HACKERS] LIMITing number of results in a VIEW with global variables
Hello,

Thank you for your answer, Robert.

> Well, SQL, or our dialect of it anyway, doesn't have global
> variables. So I think the above is going to throw a syntax error.
> You may have global variables in your C code, but those won't be
> visible from the SQL level.

I was wrong in the definition of the filtered_employees view. The correct
one is:

CREATE OR REPLACE VIEW filtered_employees AS
    SELECT *, get_mu() AS mu
    FROM employees
    ORDER BY mu DESC
    LIMIT get_k();

Note that access to the global C variables (K, MU and ALPHA) from SQL works
well with my definitions of get_k(), get_mu(), etc. There is no syntax error
here; however, the view is not working because it has side effects on the
value of MU.

I have also tried two other alternatives to the LIMIT keyword, but neither
works yet:

1) SQL instruction: RANK() OVER (ORDER BY get_mu()) AS sqlf_rank;
2) C instruction: SPI_exec(query, K) ... it leads to a segmentation fault.

> In general, I think you'd be better off not relying on C global
> variables either,

I don't understand how I could avoid using global variables in some cases.
For instance, I must store the float value $1 corresponding to a fuzzy
predicate degree in the following fuzzy2bool cast operation:

CREATE OR REPLACE FUNCTION fuzzy2bool(FLOAT)
RETURNS BOOLEAN LANGUAGE SQL AS
'SELECT set_mu($1);         -- Here $1 must be stored in MU for each record observed in the view
 SELECT $1 > get_alpha()';  -- Then $1 is converted to Boolean according to the ALPHA global value

> and instead passing the values you need as function
> arguments.

Do you mean that I should define a function
filter(table_name TEXT, k INTEGER, alpha FLOAT)?

Thanks again for your help,

Thomas Girault
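[To make the "pass values as arguments" suggestion concrete, here is a minimal Python sketch, outside SQL entirely. The linear young() membership function and the sample ages are invented for illustration; the point is only that with alpha and k as plain parameters, the result cannot depend on when any set_alpha()/set_k() side effect happens to run.]

```python
def young(age, lo=20, hi=30):
    """Hypothetical fuzzy membership: 1 below lo, 0 above hi, linear between."""
    if age <= lo:
        return 1.0
    if age >= hi:
        return 0.0
    return (hi - age) / (hi - lo)

def filter_rows(rows, predicate, alpha, k):
    """Keep rows whose membership degree exceeds alpha, ordered by degree
    descending, limited to the top k. No global MU/ALPHA/K state involved."""
    scored = [(row, predicate(row)) for row in rows]
    kept = [(row, mu) for row, mu in scored if mu > alpha]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]

ages = [16, 21, 24, 26, 26]
result = filter_rows(ages, young, alpha=0.1, k=5)
# result is ordered 16, 21, 24, 26, 26 by decreasing membership degree
```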
Re: [HACKERS] LIMITing number of results in a VIEW with global variables
On Oct 14, 2011, at 16:43, Thomas Girault wrote:

> CREATE OR REPLACE FUNCTION fuzzy2bool(FLOAT)
> RETURNS BOOLEAN LANGUAGE SQL AS 'SELECT set_mu($1); SELECT $1 > get_alpha()';

It seems dangerous for a cast to modify global state such as MU. The
evaluation order of functions during query execution isn't always easy to
guess, and may change depending on the execution plan.

> With this implicit cast, the query
> SELECT age, young(age) FROM set_alpha(0.1), employees WHERE young(age);
> is equivalent to
> SELECT age, young(age) FROM set_alpha(0.1), employees WHERE
> fuzzy2bool(young(age));

Those set_alpha() calls seem equally dangerous. If this alpha is supposed
to be a global parameter, why not set it *before* issuing the query?

> I can sort the results in the view 'sorted_employees' according to the
> value MU of a fuzzy predicate thanks to the fuzzy2bool cast function.
>
> CREATE OR REPLACE VIEW sorted_employees AS
>   SELECT *, get_mu() AS mu
>   FROM employees
>   ORDER BY mu DESC;

Are you aware that an ORDER BY clause in a VIEW is only going to work
if you do "SELECT ... FROM view"? If the outer query is more complex
than that, I wouldn't bet on the results coming back in the expected order.
Usually, you'd attach ORDER BY to the outermost query (or to subqueries
containing a LIMIT clause).

> The following query
> SELECT age, mu FROM set_k(5), set_alpha(0.1), filtered_employees

Again, you seem to rely on these set_k(), set_alpha() calls being
done before the query is executed. But postgres doesn't guarantee that.

> WHERE young(age);
> gives the results:
> age mu
> 24  1
> 16  1
> instead of:
> age mu
> 16  1
> 21  0.89976158142
> 24  0.60023841858
> 26  0.40005960464
> 26  0.40005960464
>
> It seems that the 'LIMIT K' instruction has side effects on the MU value.

The execution plan may very well vary depending on the limit.
Off-hand, I'd guess that with a small K, one of these set_whatever() calls
in one of your FROM clauses gets executed after the computation it's
supposed to affect has already happened.

> Why is it not working? How to fix this issue?

Don't rely on the execution order of function calls in a SELECT statement.
Divide your functions into two classes.

The ones which have side effects (i.e. change global state). These should
*never* be called from SQL statements, except in the trivial case of
"SELECT my_func(...);". Also, they should be marked VOLATILE.

And the ones without side effects. Those should be marked IMMUTABLE,
STABLE, or VOLATILE, depending on how they're influenced by global state
changes. Read the documentation on these flags.

Finally, don't assume that ORDER BY inside a view influences the output
order of queries using the view (except in the trivial case of
"SELECT * FROM view"). As a rule of thumb, ORDER BY in a view only makes
sense if there's also a LIMIT clause. You are then guaranteed that the
view returns the first rows according to the specified order.
Don't assume they're necessarily returned in ascending order, though.

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Call stacks and RAISE INFO
2011/10/15 Robert Haas:
> On Sat, Oct 15, 2011 at 12:24 AM, Pavel Stehule wrote:
>> I have no problem with this. A context can be false for info and true
>> for the others by default. Please, use a different identifier than
>> "context", which can be used for reading the context in the future,
>> maybe "attach_context" or something similar.
>
> error_context?

What about show_context, hide_context, hold_context, or use_context?

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
Re: [GENERAL][HACKERS] register creation date of table
On Fri, Oct 14, 2011 at 6:20 AM, Willy-Bas Loos wrote:

> 1. I think that there is no such information in the system tables. Is
> that correct?

Yes. It's been discussed before, but some people (particularly Tom, IIRC)
are not convinced that it's useful enough to justify its existence.

> 2. I would like to go back in time. I think that I will just look up
> the creation date of the files in the data directory, translate
> their oids to the object names, and then update their dates. This
> would of course only work from the last restore. Is that a good way to
> do it?

Well, that timestamp will get bumped on TRUNCATE, CLUSTER, VACUUM FULL,
and rewriting versions of ALTER TABLE.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
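[As a rough sketch of the workaround being discussed, and with Robert's caveat built in: given a mapping from relfilenode to relation name (which in practice would come from pg_class), one could read filesystem timestamps for each relation file. This is illustrative Python, not a PostgreSQL facility; the helper name and the use of st_ctime are assumptions.]

```python
import datetime
import os

def relation_file_times(data_dir, relfilenode_to_name):
    """Map relation names to the change time of their underlying file.

    Caveat: TRUNCATE, CLUSTER, VACUUM FULL, and rewriting forms of ALTER
    TABLE create a new file, so this is at best the time of the last
    rewrite, not the original creation time."""
    times = {}
    for fname in os.listdir(data_dir):
        name = relfilenode_to_name.get(fname)  # skips fork/segment suffixes
        if name is not None:
            st = os.stat(os.path.join(data_dir, fname))
            times[name] = datetime.datetime.fromtimestamp(st.st_ctime)
    return times
```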
Re: [HACKERS] Call stacks and RAISE INFO
On Sat, Oct 15, 2011 at 12:24 AM, Pavel Stehule wrote:

> I have no problem with this. A context can be false for info and true
> for the others by default. Please, use a different identifier than
> "context", which can be used for reading the context in the future,
> maybe "attach_context" or something similar.

error_context?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] LIMITing number of results in a VIEW with global variables
On Fri, Oct 14, 2011 at 10:43 AM, Thomas Girault wrote:

> I am now trying to limit the number of results in the view according
> to the global value K:
>
> CREATE OR REPLACE VIEW filtered_employees AS
>     SELECT *, get_mu() AS mu
>     FROM employees
>     ORDER BY mu DESC
>     LIMIT K;

Well, SQL, or our dialect of it anyway, doesn't have global variables. So
I think the above is going to throw a syntax error. You may have global
variables in your C code, but those won't be visible from the SQL level.

In general, I think you'd be better off not relying on C global variables
either, and instead passing the values you need as function arguments.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] WIP: collect frequency statistics for arrays
> Looking now, I see that Alexander wasn't Cc'd on the review, so it's
> possible he missed the message?

We've corresponded off list and have discussed my review at some length.
Alex submitted an updated patch on Sep 22 to me personally (although not
to the list? Alex?), with the promise of a new version with improved
comments.

Best,
Nathan
Re: [HACKERS] Bug in walsender when calling out to do_pg_stop_backup (and others?)
On Oct 11, 2011, at 09:21, Magnus Hagander wrote:
> On Tue, Oct 11, 2011 at 03:29, Florian Pflug wrote:
>> On Oct 10, 2011, at 21:25, Magnus Hagander wrote:
>>> On Thu, Oct 6, 2011 at 23:46, Florian Pflug wrote:
>>>> It'd be nice to generally terminate a backend if the client vanishes,
>>>> but so far I haven't had any bright ideas. Using FASYNC and F_SETOWN
>>>> unfortunately sends a signal *every time* the fd becomes readable or
>>>> writeable, not only on EOF. Doing select() in CHECK_FOR_INTERRUPTS
>>>> seems far too expensive.
>>>>
>>>> We could make the postmaster keep the fds around even after forking a
>>>> backend, and make it watch for broken connections using select(). But
>>>> with a large max_backends setting, we'd risk running out of fds in the
>>>> postmaster...
>>>
>>> Ugh. Yeah. But at least catching it and terminating it when we *do*
>>> notice it's down would certainly make sense...
>>
>> I'll try to put together a patch that sets a flag if we discover a broken
>> connection in pq_flush, and tests that flag in CHECK_FOR_INTERRUPTS.
>> Unless you wanna, of course.
>
> Please do, I won't have time to even think about it until after
> pgconf.eu anyway ;)

Ok, here's a first cut. I've based this on how query cancellation due to
recovery conflicts works: internal_flush() sets QueryCancelPending and
ClientConnectionLostPending. If QueryCancelPending is set,
CHECK_FOR_INTERRUPTS checks ClientConnectionLostPending, and if it's set
it does ereport(FATAL).

I've only done light testing so far; basically the only case I've tested
is killing pg_basebackup while it's waiting for all required WAL to be
archived.

best regards,
Florian Pflug

pg.discon_cancel.v1.patch
Description: Binary data