Re: [HACKERS] Documentation: GiST extension implementation
On 12 June 09 at 23:20, Tom Lane wrote:
> Dimitri Fontaine dfonta...@hi-media.com writes:
>> On 12 June 09 at 21:49, Tom Lane wrote:
>>> It seems to me it could still do with a lot more detail to specify what API the functions are really expected to implement.
>
> What's bothering me is the fuzziness of the API specifications for the support functions. It's not real clear for example what you have to do to have an index storage type different from the column datatype, and even less clear which type the same() function is comparing. Having some skeletons that execute magic bits of undocumented code is not a substitute for a specification.

Oh yes, that wasn't easy to guess: I had to look at other implementations and then do some tests (trial and error) to determine this. Andrew Gierth has been really helpful here, and his ip4r module is a good example (but without varlena).

I'll try to provide something here; what I'm trying to say is that I need some help and research (and core code reading) to reverse engineer the specs.

Regards,
-- dim

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Using results from DELETE ... RETURNING
On Sun, Jun 07, 2009 at 12:29:56AM -0400, Tom Lane wrote:
> David Fetter da...@fetter.org writes:
>> Would it be super-complicated to do this with CTEs for 8.5? They seem to have sane properties like getting executed exactly once.
>
> Hmm, interesting thought. The knock against doing RETURNING as an ordinary subquery is exactly that you can't disentangle it very well from the upper query (and thus, it's hard to figure out when to fire triggers, to take just one problem). But we've defined CTEs much more restrictively, so maybe the problems can be solved in that context.

I was discussing this with Andrew Gierth on IRC, who thought that putting RETURNING inside the WITH clause would be relatively easy, at least for the parser and planner. For the executor, he suggested that one approach might be to make INSERT, UPDATE and DELETE into their own nodes.

Comments?

Cheers,
David.
--
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fet...@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
[HACKERS] char() overhead on read-only workloads not so insignificant as the docs claim it is...
I'm currently doing some benchmarking on a Nehalem box (http://www.kaltenbrunner.cc/blog/index.php?/archives/26-Benchmarking-8.4-Chapter-1Read-Only-workloads.html) with 8.4, and while investigating what looks like issues in pgbench I also noticed that using char() has more than a negligible overhead on some (very special) read-only(!) workloads.

For example, running sysbench in read-only mode against 8.4 results in a profile (for the full run) that looks similar to:

samples   %        symbol name
981690    11.0656  bcTruelen
359183    4.0487   index_getnext
311128    3.5070   AllocSetAlloc
272330    3.0697   hash_search_with_hash_value
258157    2.9099   LWLockAcquire
195673    2.2056   _bt_compare
190303    2.1451   slot_deform_tuple
168101    1.8948   PostgresMain
164191    1.8508   _bt_checkkeys
126110    1.4215   FunctionCall2
123965    1.3973   SearchCatCache
120629    1.3597   LWLockRelease

The default sysbench mode actually uses a number of different queries, and the ones dealing with char() are only a small part of the full set of queries sent. The specific query that is causing bcTruelen to show up in the profile is:

SELECT c from sbtest where id between $1 and $2 order by c

where the parameters are for example $1 = '5009559', $2 = '5009658', i.e. ranges of 100.

Benchmarking only that query results in:

samples   %        symbol name
2148182   23.5861  bcTruelen
369463    4.0565   index_getnext
362784    3.9832   AllocSetAlloc
284198    3.1204   slot_deform_tuple
185279    2.0343   _bt_checkkeys
180119    1.9776   LWLockAcquire
172733    1.8965   appendBinaryStringInfo
144158    1.5828   internal_putbytes
141040    1.5486   AllocSetFree
138093    1.5162   printtup
124255    1.3643   hash_search_with_hash_value
117054    1.2852   heap_form_minimal_tuple

at around 46000 queries/s.

Changing the default sysbench schema from:

          Table public.sbtest
 Column |      Type      | Modifiers
--------+----------------+-----------------------------------------------------
 id     | integer        | not null default nextval('sbtest_id_seq'::regclass)
 k      | integer        | not null default 0
 c      | character(120) | not null default ''::bpchar
 pad    | character(60)  | not null default ''::bpchar
Indexes:
    sbtest_pkey PRIMARY KEY, btree (id)
    k btree (k)

to

          Table public.sbtest
 Column |       Type        | Modifiers
--------+-------------------+-----------------------------------------------------
 id     | integer           | not null default nextval('sbtest_id_seq'::regclass)
 k      | integer           | not null default 0
 c      | character varying | not null default ''::character varying
 pad    | character(60)     | not null default ''::bpchar
Indexes:
    sbtest_pkey PRIMARY KEY, btree (id)
    k btree (k)

results in a near 50%(!) speedup in terms of tps, to around 67000 queries/s. This is however an extreme case, because the c column actually contains no data at all (except for an empty string).

The profile for the changed testcase looks like:

430797    5.       index_getnext
396750    4.8095   AllocSetAlloc
345508    4.1883   slot_deform_tuple
228222    2.7666   appendBinaryStringInfo
227766    2.7610   _bt_checkkeys
193818    2.3495   LWLockAcquire
179925    2.1811   internal_putbytes
168871    2.0471   printtup
152026    1.8429   AllocSetFree
146333    1.7739   heap_form_minimal_tuple
144305    1.7493   FunctionCall2
128320    1.       hash_search_with_hash_value

At the very least we should reconsider this part of our docs:

    There is no performance difference between these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column.

from http://www.postgresql.org/docs/8.4/static/datatype-character.html

regards
Stefan
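To make the bcTruelen entry in the profiles above more concrete, here is a minimal Python sketch of the kind of work such a function has to do: finding the length of a blank-padded char(n) value with its trailing spaces ignored. The name `bc_true_len` and the sample values are illustrative assumptions; this is not the PostgreSQL source, which is C code in the backend.

```python
def bc_true_len(padded: bytes) -> int:
    """Return the length of a blank-padded value, ignoring trailing spaces."""
    i = len(padded)
    # Walk backwards over the padding; the cost grows with the pad length.
    while i > 0 and padded[i - 1] == 0x20:
        i -= 1
    return i


# An empty string stored in a char(120) column is 120 spaces (assumed layout).
empty_c = b" " * 120
short_c = b"abc" + b" " * 117

print(bc_true_len(empty_c))
print(bc_true_len(short_c))
```

If the real function behaves along these lines, the empty-string case in the benchmark would be close to the worst case, since every call has to step over all 120 pad bytes before it can report a length of zero.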
Re: [HACKERS] machine-readable explain output
On Saturday 13 June 2009 01:10:06 Robert Haas wrote:
> pgexplain, as it happens... I could post some samples of the output, but it seems like it might be just as well to let those who are curious try it for themselves. I'd rather get opinions from people who care enough to download and test than from those who are just bikeshedding. :-)

I recommend, however, that you think about writing a regression test for this, so the interfaces are explicit, and those tweaking them in the future know what they are dealing with.

A couple of comments on the specifics of the output:

For the JSON format:

* Numbers should not be quoted.

For the XML format:

* Instead of <pgexplain>, use <explain> with an XML namespace declaration.

The schema name is missing in either output format. I think that was supposed to be one of the features of this, that the objects are unambiguously qualified.

I'm not sure I like element names such as Node-Type, instead of say nodetype, which is more like HTML and DocBook. (Your way might be more like SOAP, I guess.)

Also, the result type of an EXPLAIN (format xml) should be type xml, not text.

In general, I like this direction very much. There will probably be more tweaks on the output format over time. It's not like the plain EXPLAIN hasn't been tweaked countless times.
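The remark that numbers should not be quoted in the JSON format can be illustrated with a short Python sketch. The key name `Total-Cost` and its value are made up for illustration and are not taken from the patch's actual output.

```python
import json

# Same hypothetical field, once emitted as a quoted string and once as a number.
quoted = json.loads('{"Total-Cost": "8.27"}')
unquoted = json.loads('{"Total-Cost": 8.27}')

print(type(quoted["Total-Cost"]).__name__)
print(type(unquoted["Total-Cost"]).__name__)
```

With the quoted form, every client has to know which fields are numeric and convert them by hand; with the unquoted form the JSON parser does it.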
Re: [HACKERS] pgindent run coming
On Wednesday 10 June 2009 23:54:41 Tom Lane wrote:
> Peter Eisentraut pete...@gmx.net writes:
>>> I think it usually does that already ...
>> Um, attached you will find a bunch of counterexamples.
>
> At a quick look, I'm not sure that any of these are in code that hasn't been edited since the 8.3 pgindent run.

So what does that mean then? Surely pgindent doesn't keep track of what has been edited when?
Re: [HACKERS] pgindent run coming
Peter Eisentraut wrote:
> On Wednesday 10 June 2009 23:54:41 Tom Lane wrote:
>> Peter Eisentraut pete...@gmx.net writes:
>>>> I think it usually does that already ...
>>> Um, attached you will find a bunch of counterexamples.
>>
>> At a quick look, I'm not sure that any of these are in code that hasn't been edited since the 8.3 pgindent run.
>
> So what does that mean then? Surely pgindent doesn't keep track of what has been edited when?

If the code has been edited since the last pgindent run, then pgindent hasn't had a chance to adjust it, no?

cheers

andrew
Re: [HACKERS] pgindent run coming
Andrew Dunstan and...@dunslane.net writes:
> Peter Eisentraut wrote:
>> On Wednesday 10 June 2009 23:54:41 Tom Lane wrote:
>>> At a quick look, I'm not sure that any of these are in code that hasn't been edited since the 8.3 pgindent run.
>>
>> So what does that mean then? Surely pgindent doesn't keep track of what has been edited when?
>
> If the code has been edited since the last pgindent run, then pgindent hasn't had a chance to adjust it, no?

Right. Those extra spaces all represent manual editing sloppiness. I have not done a complete check, but I looked at the first couple of examples you cited and verified that pgindent did remove those spaces.

regards, tom lane
Re: [HACKERS] some of the datatypes only support hashing, while others only support sorting
Robert Haas robertmh...@gmail.com writes:
> This errdetail doesn't seem quite right in the following situation:
>
> rhaas=# select distinct proacl from pg_proc;
> ERROR: could not implement DISTINCT
> DETAIL: Some of the datatypes only support hashing, while others only support sorting.

Hmm, interesting case. The reason the planner is assuming that that must be the failure mode is that the parser is not supposed to let through a DISTINCT request for a datatype that can't be either sorted or hashed. proacl is of course of type aclitem[], and type aclitem has a hashable equality operator but no sort operator. That causes get_sort_group_operators() to assume that aclitem[] can likewise be hashed but not sorted. However, there is no hash opclass for anyarray, so actually it's not hashable either; and the test the planner uses discovers that.

It seems like we ought to add opclass entries and an anyarray hash function, but of course it's too late for that for 8.4. What I'll do for the moment is kluge up get_sort_group_operators() to reflect the fact that arrays are only sortable and not hashable.

regards, tom lane
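As a rough analogy for the two DISTINCT strategies discussed above, the following Python sketch de-duplicates by hashing when the values allow it and falls back to sorting otherwise. It is only an analogy in Python (where lists happen to be sortable but not hashable); the function name `distinct` is hypothetical and nothing here reflects how the planner or get_sort_group_operators() is actually written.

```python
from itertools import groupby


def distinct(values):
    """De-duplicate using hashing when possible, otherwise sorting."""
    try:
        # Hash strategy: requires every value to be hashable.
        return list(dict.fromkeys(values)), "hash"
    except TypeError:
        # Sort strategy: needs only an ordering; equal values become adjacent.
        return [key for key, _ in groupby(sorted(values))], "sort"


print(distinct([3, 1, 3, 2]))
print(distinct([[2, 1], [1], [2, 1]]))
```

A value type that supports neither hashing nor ordering would fail in both branches, which corresponds to the case the parser is supposed to reject before the planner ever sees it.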
Re: [HACKERS] PostgreSQL Developer meeting minutes up
2009/6/7 Tom Lane t...@sss.pgh.pa.us:
> So there are a lot of good reasons to work backwards in patching. I don't believe that these would be outweighed by some advantage in the mechanics of applying an unchanging patch to multiple branches (especially since AFAICT the mechanical advantage would be pretty darn minimal anyhow).

As another data point, the stable branches of the Linux kernel are actually maintained this way. There is a policy that any patch for the stable branches must already have been included (in some form) in HEAD. There is no merging going on. They aren't even using git cherry-pick, but that's because all backpatching goes into a review list rather than happening immediately.

The multiple branches and merging that go on in the Linux kernel are all about development of new features, not fixing of bugs.
[HACKERS] Suppressing occasional failures in copy2 regression test
Every so often the buildfarm shows row-ordering differences in the copy2 test, for example
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=jaguar&dt=2009-06-13%2003:00:02
(jaguar seems particularly prone to this for some reason, but other members have shown it too.)

I believe what is happening is that autovacuum chances to trigger on the table being used, allowing some of the updated rows to be placed in positions they're not normally placed in. There is a simple fix for that: change the table to be a temp table, thus preventing autovac from touching it.

Any objections to doing that?

regards, tom lane
Re: [HACKERS] Suppressing occasional failures in copy2 regression test
Sorry for top-posting -- stupid apple mail client...

I'm not sure about that. It seems like race conditions with autovacuum are a real potential bug that it would be nice to be testing for.

Another solution would be adding an order by clause - effectively trading coverage of unordered raw scans for coverage of the vacuum races.

Or a third option would be adding alternate outputs for each ordering we observe. I suspect there aren't that many for serial tests but I'm less confident of that for the parallel tests.

-- Greg

On 13 Jun 2009, at 17:27, Tom Lane t...@sss.pgh.pa.us wrote:
> Every so often the buildfarm shows row-ordering differences in the copy2 test, for example
> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=jaguar&dt=2009-06-13%2003:00:02
> (jaguar seems particularly prone to this for some reason, but other members have shown it too.)
>
> I believe what is happening is that autovacuum chances to trigger on the table being used, allowing some of the updated rows to be placed in positions they're not normally placed in. There is a simple fix for that: change the table to be a temp table, thus preventing autovac from touching it.
>
> Any objections to doing that?
>
> regards, tom lane
Re: [HACKERS] Suppressing occasional failures in copy2 regression test
Greg Stark greg.st...@enterprisedb.com writes:
> I'm not sure about that. It seems like race conditions with autovacuum are a real potential bug that it would be nice to be testing for.

It's not a bug; it's a limitation of our testing framework that it sees this as a failure. Serious testing for autovac race conditions would indeed be interesting, but you're never going to get anything meaningful in that direction out of the current framework.

> Another solution would be adding an order by clause - effectively trading coverage of unordered raw scans for coverage of the vacuum races.

And destroying one of the main points of the copy2 test, which is that those triggers are supposed to fire in a specific order.

> Or a third option would be adding alternate outputs for each ordering we observe. I suspect there aren't that many for serial tests but I'm less confident of that for the parallel tests.

There are several variants already observed, I believe, and I have little confidence that there aren't more. In any case, that's a kluge not a solution, and it still degrades the ability of the test to cover what it was designed to cover.

regards, tom lane
Re: [HACKERS] Suppressing occasional failures in copy2 regression test
On Sat, Jun 13, 2009 at 2:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Greg Stark greg.st...@enterprisedb.com writes:
>> I'm not sure about that. It seems like race conditions with autovacuum are a real potential bug that it would be nice to be testing for.
>
> It's not a bug; it's a limitation of our testing framework that it sees this as a failure. Serious testing for autovac race conditions would indeed be interesting, but you're never going to get anything meaningful in that direction out of the current framework.

The elephant in the room here may be moving to some more flexible/powerful testing framework, but the difficulty will almost certainly be in agreeing what it should look like. The actual writing of said test framework will take some work too, but to some degree that's a SMOP.

This tuple-ordering issue seems to be one that comes up over and over again, but in the short term, making it a TEMP table seems like a reasonable fix.

...Robert
Re: [HACKERS] machine-readable explain output
--On 13 June 2009 15:01:43 -0400 Robert Haas robertmh...@gmail.com wrote:
>> Also, the result type of an EXPLAIN (format xml) should be type xml, not text.
>
> Seems reasonable. I'll see if I can figure out how to do that.

I suppose it's okay then, that the format is not available when the server isn't built with --with-libxml?

-- Thanks
Bernd
Re: [HACKERS] machine-readable explain output
Bernd Helmle maili...@oopsware.de writes:
> --On 13 June 2009 15:01:43 -0400 Robert Haas robertmh...@gmail.com wrote:
>>> Also, the result type of an EXPLAIN (format xml) should be type xml, not text.
>>
>> Seems reasonable. I'll see if I can figure out how to do that.
>
> I suppose it's okay then, that the format is not available when the server isn't built with --with-libxml?

I believe we have things set up so that you can still print xml data without libxml configured in. We'd need to be sure casting to text works too, but other than that I don't see an issue here.

regards, tom lane
Re: [HACKERS] machine-readable explain output
On Sat, Jun 13, 2009 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Bernd Helmle maili...@oopsware.de writes:
>> --On 13 June 2009 15:01:43 -0400 Robert Haas robertmh...@gmail.com wrote:
>>>> Also, the result type of an EXPLAIN (format xml) should be type xml, not text.
>>>
>>> Seems reasonable. I'll see if I can figure out how to do that.
>>
>> I suppose it's okay then, that the format is not available when the server isn't built with --with-libxml?
>
> I believe we have things set up so that you can still print xml data without libxml configured in. We'd need to be sure casting to text works too, but other than that I don't see an issue here.

Hmm, I just tried to do this by modifying ExplainResultDesc to use XMLOID rather than TEXTOID when stmt->format == EXPLAIN_FORMAT_XML, and sure enough, explain (format xml) ... fails when --with-libxml is not specified. But maybe that's not the right way to do it. Now that I think about it, using that in combination with do_text_output_multiline() seems totally wrong even if we end up deciding not to worry about the output type, since while there are multiple rows when the output is considered as text, there is surely only one row when you look at the whole thing as an XML document. I'm not too sure how to do this though. Help?

In any event, considering that EXPLAIN is a utility statement and can't be embedded within a query, I'm not sure what benefit we get out of returning the data as XML rather than text. This doesn't seem likely to change either, based on Tom's comments here.

http://archives.postgresql.org/pgsql-hackers/2009-05/msg00969.php

...Robert
Re: [HACKERS] machine-readable explain output
Robert Haas robertmh...@gmail.com writes:
> In any event, considering that EXPLAIN is a utility statement and can't be embedded within a query, I'm not sure what benefit we get out of returning the data as XML rather than text. This doesn't seem likely to change either, based on Tom's comments here.
> http://archives.postgresql.org/pgsql-hackers/2009-05/msg00969.php

I think you misinterpreted the point of that example, which is that there already is a way to get the output of EXPLAIN into the system for further processing. Were this not so, we wouldn't be worrying at all what data type it claims to have. But since there is a way, it's important what data type it produces.

regards, tom lane
Re: [HACKERS] machine-readable explain output
On Sat, Jun 13, 2009 at 7:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> In any event, considering that EXPLAIN is a utility statement and can't be embedded within a query, I'm not sure what benefit we get out of returning the data as XML rather than text. This doesn't seem likely to change either, based on Tom's comments here.
>> http://archives.postgresql.org/pgsql-hackers/2009-05/msg00969.php
>
> I think you misinterpreted the point of that example, which is that there already is a way to get the output of EXPLAIN into the system for further processing. Were this not so, we wouldn't be worrying at all what data type it claims to have. But since there is a way, it's important what data type it produces.

Well, if you get the EXPLAIN output into the system by defining a wrapper function, said wrapper function will return the type that it's defined to return, regardless of what EXPLAIN itself returns, no?

I don't have a problem making it return XML; I'm just not exactly sure how to do it. Is it possible to get that working without depending on libxml? How?

...Robert
Re: [HACKERS] machine-readable explain output
On Saturday 13 June 2009 22:01:43 Robert Haas wrote:
>> * Instead of <pgexplain>, use <explain> with an XML namespace declaration.
>
> Could you specify this a bit further, like write out exactly what you want it to look like? My XML-fu is not very strong.

Just replace your <pgexplain> by

<explain xmlns="http://www.postgresql.org/2009/explain">

The actual URI doesn't matter, as long as it is distinguishing. The value I chose here follows conventions used by W3C.
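For readers whose XML-fu is similarly limited, here is a minimal Python sketch of what the namespace declaration means for a client. The namespace URI is the one proposed above; the child elements (`Plan`, and `Node-Type` with the value `Result`) are assumptions made up for the example, not the patch's actual output.

```python
import xml.etree.ElementTree as ET

NS = "http://www.postgresql.org/2009/explain"

# Hypothetical document using the proposed root element and namespace.
doc = f'<explain xmlns="{NS}"><Plan><Node-Type>Result</Node-Type></Plan></explain>'

root = ET.fromstring(doc)
# The parser reports the tag qualified by its namespace URI.
print(root.tag)

# Lookups name the namespace explicitly, here via an arbitrary "pg" prefix.
node_type = root.find("pg:Plan/pg:Node-Type", {"pg": NS})
print(node_type.text)
```

Because every element is qualified by the URI, a generic name like explain cannot be confused with an element of the same name from some other vocabulary, which is why the root no longer needs a pg prefix in its name.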
Re: [HACKERS] machine-readable explain output
On Saturday 13 June 2009 22:01:43 Robert Haas wrote:
>> I recommend, however, that you think about writing a regression test for this, so the interfaces are explicit, and those tweaking them in the future know what they are dealing with.
>
> I would like to have something in this area, but Tom didn't think it was workable.
>
> http://archives.postgresql.org/message-id/603c8f070904151623ne07d744k615edd4aa669a...@mail.gmail.com
>
> Currently, we don't even have something trivial like EXPLAIN SELECT 1 in the regression tests, so even if you completely break EXPLAIN so that it core dumps (voice of experience speaking here) make check still passes with flying colors.

That post described a scenario where you check whether, given a data set and ANALYZE, the optimizer produces a certain plan. I agree that that might be tricky. A regression test for EXPLAIN, however, should primarily check whether the output format is stable. We are planning to offer this as a public interface, after all. You could use faked up statistics and all but one or two plan types turned off, and then the results should be pretty stable. Unless the fundamental cost model changes, but it doesn't do that very often for the simpler plan types anyway.

Things to check for would be whether all the fields are there, quoted and escaped correctly, and what happens if statistics are missing or corrupted, etc. Or whether you get any output at all, as you say.
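The "are all the fields there" part of such a check could look roughly like the Python sketch below, which walks a JSON plan and reports nodes that lack expected keys. This is a sketch under stated assumptions: the field names `Node-Type`, `Total-Cost` and `Plans`, and the helper `missing_fields`, are hypothetical, and a real regression test would live in the project's own test framework rather than in Python.

```python
import json

# Hypothetical set of keys every plan node is expected to carry.
REQUIRED = {"Node-Type", "Total-Cost"}


def missing_fields(plan_json):
    """Walk a plan tree and list (path, missing keys) for incomplete nodes."""
    problems = []

    def walk(node, path):
        missing = REQUIRED - node.keys()
        if missing:
            problems.append((path, sorted(missing)))
        # "Plans" is assumed to hold the child nodes, if any.
        for i, child in enumerate(node.get("Plans", [])):
            walk(child, f"{path}/{i}")

    walk(json.loads(plan_json), "root")
    return problems


sample = '{"Node-Type": "Sort", "Total-Cost": 1.0, "Plans": [{"Node-Type": "Seq Scan"}]}'
print(missing_fields(sample))
```

A check of this shape says nothing about whether the chosen plan is the right one, only that the output keeps its structure, which is the narrower goal described above.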
Re: [HACKERS] machine-readable explain output
On Sunday 14 June 2009 07:27:19 Robert Haas wrote:
> On Sat, Jun 13, 2009 at 7:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Robert Haas robertmh...@gmail.com writes:
>>> In any event, considering that EXPLAIN is a utility statement and can't be embedded within a query, I'm not sure what benefit we get out of returning the data as XML rather than text. This doesn't seem likely to change either, based on Tom's comments here.
>>> http://archives.postgresql.org/pgsql-hackers/2009-05/msg00969.php
>>
>> I think you misinterpreted the point of that example, which is that there already is a way to get the output of EXPLAIN into the system for further processing. Were this not so, we wouldn't be worrying at all what data type it claims to have. But since there is a way, it's important what data type it produces.
>
> Well, if you get the EXPLAIN output into the system by defining a wrapper function, said wrapper function will return the type that it's defined to return, regardless of what EXPLAIN itself returns, no?
>
> I don't have a problem making it return XML; I'm just not exactly sure how to do it. Is it possible to get that working without depending on libxml? How?

Even if this doesn't end up being feasible, I feel it's important that the XML and JSON formats return one datum, not one per line. Otherwise a client that wants to do some processing on the result will have to do about three extra steps to get the result usable.
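The difference between one datum and one row per line can be sketched from the client's side in Python. Both result sets below are hypothetical stand-ins for what a database driver might hand back; the field name `Node-Type` and its value are invented for the example.

```python
import json

# Hypothetical result set when each output line arrives as its own row.
rows = [("[",), ('  {"Node-Type": "Result"}',), ("]",)]

# Hypothetical result set when the whole document is a single datum.
single = [('[{"Node-Type": "Result"}]',)]

# Per-line output: pull the column out of every row, re-join, then parse.
plan_from_rows = json.loads("\n".join(row[0] for row in rows))

# Single datum: parse the one value directly.
plan_from_single = json.loads(single[0][0])

print(plan_from_rows == plan_from_single)
```

The per-line variant also depends on the client joining the rows in the order they were returned, an assumption that disappears entirely with a single datum.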