Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Sun, 2007-11-25 at 19:35 -0500, Tom Lane wrote: The cost of resolving ambiguous operators has been an issue for a long time, of course, but it seems particularly bad in this case --- gprof blames 37% of the runtime on oper_select_candidate(). It might be time to think about caching the results of operator searches somehow. Too late for 8.3 though. Wow: 37%. varchar_column = const is a very, very common predicate. 37% is enough to still be visible for a wide range of queries, not just the very simple. I think queries with WHERE clauses like (int4_column = int4_const AND varchar_column = const) will also be noticeably affected this. So even when we have integer keys, we will still get slowed down by an checks to an additional status column. Caching is the right way around this, though as you point out, that is not an option for 8.3. But I think there must be an action that we can take for 8.3 and that much runtime should not be given away easily. ISTM that we can win back the losses Guillaume has identified, plus gain a little more even. Can we just hard-code the varchar lookup? Ugly, but it will add almost nothing to non-varchar paths and yet speed-up the varchar lookup dramatically. I guess the objection to that will be that it prevents people from overloading the = operator for varchars to change the selectivity functions etc. So how about we have a cache-of-one: we store the best varchar = operator after the first lookup, then document that if people overload this then they must reconnect. That's an acceptable pain for the few people affected and a great benefit for the most people. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] plpgsql: another new reserved word
On 11/10/07, Tom Lane [EMAIL PROTECTED] wrote: The current plpgsql code seems to be designed to force a qualifier to be interpreted as a block label if at all possible, even if there are more-closely-nested alternative interpretations; so in the above example it would assign to the outer variable bar. This seems a tad bogus to me. Can anyone comment on how Oracle handles cases like this? Some googling brought following link: http://download-uk.oracle.com/docs/cd/B14117_01/appdev.101/b10807/d_names.htm I have not parsed it completely, but rule seems simple - inner scope overrides outer one and no magic on unqualified idents, if ident is unqualified, it wont be matched to schema, block or some other qualifier. (well, at least no such magic behaviour is mentioned.) -- marko ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Autovacuum and OldestXmin
On Thu, 2007-11-22 at 21:59 -0500, Christopher Browne wrote: The world rejoiced as [EMAIL PROTECTED] (Alvaro Herrera) wrote: Simon Riggs wrote: I notice that slony records the oldestxmin that was running when it last ran a VACUUM on its tables. This allows slony to avoid running a VACUUM when it would be clearly pointless to do so. AFAICS autovacuum does not do this, or did I miss that? Hmm, I think it's just because nobody suggested it and I didn't came up with the idea. Whether it's a useful thing to do is a different matter. Why store it per table and not more widely? Perhaps per database would be just as useful; and maybe it would allow us to skip running autovac workers when there is no point in doing so. I think I need to take blame for that feature in Slony-I ;-). Good thinking. I imagine it might be useful to add it to autovac, too. I thought it was pretty neat that this could be successfully handled by comparison with a single value (e.g. - eldest xmin), and I expect that using a single quasi-global value should be good enough for autovac. I've just looked at that to see if it is that easy; I don't think it is. That works for slony currently because we vacuum all of the slony tables at once. Autovacuum does individual tables so we'd need to store the individual values otherwise we might skip doing a VACUUM when it could have done some useful work. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Am Samstag, 24. November 2007 schrieb Simon Riggs: In many cases, 100% of queries are unprepared. I have seen many applications where prepared queries caused stale plans and poor performance. We have in many cases achieved great performance gains by turning off prepared queries globally, for example in the driver layer. It had once gotten to a point where uses prepared statements was on par with never touched postgresql.conf and never heard of VACUUM as the worst performance sins. I will gladly revisit this with 8.3, but I am not very optimistic. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 26, 2007 11:59 AM, Peter Eisentraut [EMAIL PROTECTED] wrote: I have seen many applications where prepared queries caused stale plans and poor performance. We have in many cases achieved great performance gains by turning off prepared queries globally, for example in the driver layer. It had once gotten to a point where uses prepared statements was on par with never touched postgresql.conf and never heard of VACUUM as the worst performance sins. I will gladly revisit this with 8.3, but I am not very optimistic. +1 on this point as I explained it earlier. -- Guillaume ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Autovacuum and OldestXmin
Simon Riggs wrote: On Thu, 2007-11-22 at 21:59 -0500, Christopher Browne wrote: I imagine it might be useful to add it to autovac, too. I thought it was pretty neat that this could be successfully handled by comparison with a single value (e.g. - eldest xmin), and I expect that using a single quasi-global value should be good enough for autovac. I've just looked at that to see if it is that easy; I don't think it is. That works for slony currently because we vacuum all of the slony tables at once. Autovacuum does individual tables so we'd need to store the individual values otherwise we might skip doing a VACUUM when it could have done some useful work. Yeah, that was my conclusion too. -- Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC Voy a acabar con todos los humanos / con los humanos yo acabaré voy a acabar con todos / con todos los humanos acabaré (Bender) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Replacement Selection
Hi to all. I'm new. I'd like to integrate my code into PostgreSQL. It's the implementation of some refinements of Replacement Selection algorithm used for External Sorting. I have got some issue and preferibly I'd like to be supported by some developers that have something to do with it. Who can I talk to? Thanks for your attentions. Good Luck! Manolo. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [COMMITTERS] pgsql: update files for beta3
Am Samstag, 24. November 2007 schrieb Bruce Momjian: Peter, were you going to address this? It's done now. diff -ur ../cvs-pgsql/configure.in ./configure.in --- ../cvs-pgsql/configure.in 2007-11-16 21:25:10.0 +0100 +++ ./configure.in 2007-11-16 22:27:36.0 +0100 @@ -19,7 +19,7 @@ AC_INIT([PostgreSQL], [8.3beta3], [EMAIL PROTECTED]) -AC_PREREQ(2.59) +m4_if(m4_defn([m4_PACKAGE_VERSION]), [2.59], [], [m4_fatal([Autoconf version 2.59 is required])]) AC_COPYRIGHT([Copyright (c) 1996-2007, PostgreSQL Global Development Group]) AC_CONFIG_SRCDIR([src/backend/access/common/heaptuple.c]) AC_CONFIG_AUX_DIR(config) -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: I'm new. I'd like to integrate my code into PostgreSQL. It's the implementation of some refinements of Replacement Selection algorithm used for External Sorting. I have got some issue and preferibly I'd like to be supported by some developers that have something to do with it. Who can I talk to? This mailing list is the right place to discuss that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [GENERAL] possible to create multivalued index from xpath() results in 8.3?
Am Freitag, 23. November 2007 schrieb Tom Lane: Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane wrote: Should we force initdb to correct these pg_proc entries, or just quietly change pg_proc.h? Considering the extent of the changes, I'd be in favor of forcing an initdb. Well, if you're going to change the contents of pg_cast then there is little choice. I was considering something less invasive ... I will hang on to this patch for a few more days to see if any invasive catalog changes come out of the quote_literal/set_config discussion. If not, I'll consider a less invasive solution. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Replacement Selection
Thanks for your support. I downloaded the source code of the last stable version of PostgreSQL. Where can I find the part related to the External Sorting algorithm (supposed to be Replacement Selection)? I mean, which is the file to be studied and/or modified and/or substituted? Thanks for your attention. -- From: Heikki Linnakangas [EMAIL PROTECTED] Sent: Monday, November 26, 2007 1:35 PM To: [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection [EMAIL PROTECTED] wrote: I'm new. I'd like to integrate my code into PostgreSQL. It's the implementation of some refinements of Replacement Selection algorithm used for External Sorting. I have got some issue and preferibly I'd like to be supported by some developers that have something to do with it. Who can I talk to? This mailing list is the right place to discuss that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: Thanks for your support. I downloaded the source code of the last stable version of PostgreSQL. Where can I find the part related to the External Sorting algorithm (supposed to be Replacement Selection)? I mean, which is the file to be studied and/or modified and/or substituted? src/backend/utils/sort/tuplesort.c -- Alvaro Herrera Developer, http://www.PostgreSQL.org/ I would rather have GNU than GNOT. (ccchips, lwn.net/Articles/37595/) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: I downloaded the source code of the last stable version of PostgreSQL. Where can I find the part related to the External Sorting algorithm (supposed to be Replacement Selection)? I mean, which is the file to be studied and/or modified and/or substituted? In src/backend/utils/sort/tuplesort.c. The comments at the top of that file is a good place to start. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Locating sharedir in PostgreSQL on Windows
Hi everyone, I'm working on a piece of code for PostGIS to allow the loading of projection configuration files from the share/postgresql directory, but I can't find a way of getting this to work under Win32. AIUI the way to do this would be to use a combination of my_exec_path and get_share_path in order to find the directory, but MingW refuses to find the my_exec_path variable during linking. Unfortunately I suspect that this because my_exec_path is not declared as DLLIMPORT in backend/utils/init/globals.c :( I really need to find a solution to this that can work all the way back to PostgreSQL 8.0 - can anyone think of any better ideas? Many thanks, Mark. -- ILande - Open Source Consultancy http://www.ilande.co.uk ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Simon Riggs [EMAIL PROTECTED] writes: But I think there must be an action that we can take for 8.3 and that much runtime should not be given away easily. ISTM that we can win back the losses Guillaume has identified, plus gain a little more even. Perhaps some sanity could be restored to this discussion by pointing out that the 2007-01-01 code *also* clocks in at 37% spent in oper_select_candidate. IOW it's been like this for a very long time. I'm not interested in destabilizing 8.3 with panicky last-minute patches. So how about we have a cache-of-one: Cache-of-one has exactly the same difficulty as cache-of-many, other than the table lookup itself, which is a solved problem (hashtable). You still have to determine how you identify the cached value and what events require a cache flush. Nor do I see any particular reason to assume that a cache of only one operator would be of any use for real-world apps, as opposed to toy examples. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
initdb for RC1 (was Re: [HACKERS] [GENERAL] possible to create multivalued index from xpath() results in 8.3?)
Peter Eisentraut [EMAIL PROTECTED] writes: Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane wrote: Well, if you're going to change the contents of pg_cast then there is little choice. I was considering something less invasive ... I will hang on to this patch for a few more days to see if any invasive catalog changes come out of the quote_literal/set_config discussion. If not, I'll consider a less invasive solution. ATM it seems that consensus is to change quote_literal, so we may as well adopt the cleaner solution for fixing the xml functions too. Anyone out there who wants to argue against forcing initdb for RC1? If so, better speak up now. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] plpgsql: another new reserved word
Marko Kreen [EMAIL PROTECTED] writes: On 11/10/07, Tom Lane [EMAIL PROTECTED] wrote: Can anyone comment on how Oracle handles cases like this? Some googling brought following link: http://download-uk.oracle.com/docs/cd/B14117_01/appdev.101/b10807/d_names.htm Hmm, interesting document. I think the bit that is relevant for us is the statement : An outer capture occurs when a name in an inner scope, which once : resolved to an entity in an inner scope, is resolved to an entity in an : outer scope. SQL and PL/SQL are designed to prevent outer captures. You : do not need to take any action to avoid this condition. AFAICT this means that if there is any ambiguity, the most closely nested possible interpretation will always win. Therefore the current behavior of plpgsql is indeed wrong, because it searches up the namespace stack for block labels too soon. I'll go see about changing that. The subsection Qualifying References to Object Attributes and Methods describes some truly bletcherous behavior --- you can't reference a complex-column component unless you start from a table alias? How non-orthogonal. I prefer our current solution of making you parenthesize the column reference before you access its component ... regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] proposal, plpgsql, 8.4, for record in cursor
Hello I propose new kind of FOR statement .. iteration over cursor, There are two reasons: a) better readability of procedure; - SQL statement is outside of statement, b) better conformance with PL/SQL. Sample: CREATE OR REPLACE FUNCTION foo() RETURNS void AS $$ DECLARE c CURSOR(p integer) FOR SELECT * FROM foot WHERE a = p; BEGIN FOR r IN c(10) LOOP RAISE NOTICE '%', r.a; END LOOP; RETURN; END; $$ FUNCTION plpgsql; Syntax: FOR recordvar IN cursor[ ( params ) ] LOOP .. END LOOP; Notes: recordvar is local in statement body (like fori). Used cursor is automatically opened and closed. Lot of code can be shared with fors and dynafors (some with stmt_open). Regards Pavel Stehule ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Replacement Selection
Ok guys! Thanks for your help. Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? Thanks for your time! -- From: Heikki Linnakangas [EMAIL PROTECTED] Sent: Monday, November 26, 2007 2:34 PM To: [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection [EMAIL PROTECTED] wrote: I downloaded the source code of the last stable version of PostgreSQL. Where can I find the part related to the External Sorting algorithm (supposed to be Replacement Selection)? I mean, which is the file to be studied and/or modified and/or substituted? In src/backend/utils/sort/tuplesort.c. The comments at the top of that file is a good place to start. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] proposal, plpgsql, 8.4, for record in cursor
Pavel Stehule [EMAIL PROTECTED] writes: I propose new kind of FOR statement .. iteration over cursor, This seems useless and probably syntactically ambiguous. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] proposal, plpgsql, 8.4, for record in cursor
Am Montag, 26. November 2007 schrieb Tom Lane: Pavel Stehule [EMAIL PROTECTED] writes: I propose new kind of FOR statement .. iteration over cursor, This seems useless and probably syntactically ambiguous. I think that is isomorphic to what he mentioned as better conformance with PL/SQL. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
Am Montag, 26. November 2007 schrieb Mark Cave-Ayland: I'm working on a piece of code for PostGIS to allow the loading of projection configuration files from the share/postgresql directory, but The share directory is the wrong place for configuration files anyway. And moreover, non-PostgreSQL packages have no business putting files into PostgreSQL's private directories. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? You haven't given any details on what you're trying to do. What are you trying to do? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: Ok guys! Thanks for your help. Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? I think you should print the file and read it several times until you understand what's going on. Then you can start thinking where and how to modify it. -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J Oh, great altar of passive entertainment, bestow upon me thy discordant images at such speed as to render linear thought impossible (Calvin a la TV) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
Peter Eisentraut [EMAIL PROTECTED] writes: Am Montag, 26. November 2007 schrieb Mark Cave-Ayland: I'm working on a piece of code for PostGIS to allow the loading of projection configuration files from the share/postgresql directory, but The share directory is the wrong place for configuration files anyway. And moreover, non-PostgreSQL packages have no business putting files into PostgreSQL's private directories. That doesn't seem logical to me. We expect tsearch configuration files to be put into $sharedir/tsearch_data, so why shouldn't PostGIS use a similar approach to dealing with external configuration data? I'm not sure why Mark's having a problem accessing my_exec_path --- it *is* declared DLLIMPORT in miscadmin.h (which is where it counts, AIUI) clear back to 8.0. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Mon, 2007-11-26 at 09:55 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: But I think there must be an action that we can take for 8.3 and that much runtime should not be given away easily. ISTM that we can win back the losses Guillaume has identified, plus gain a little more even. Perhaps some sanity could be restored to this discussion by pointing out that the 2007-01-01 code *also* clocks in at 37% spent in oper_select_candidate. IOW it's been like this for a very long time. I'm not interested in destabilizing 8.3 with panicky last-minute patches. I think this is worth taking action on, IMHO. I've written up my suggestion as a 5 line patch, with measured performance improvement of about 40% for the varchar case. It isn't a great long term fix, but I don't want to delay 8.3 either with the full fix. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Replacement Selection
Alvaro Herrera [EMAIL PROTECTED] writes: [EMAIL PROTECTED] wrote: Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? I think you should print the file and read it several times until you understand what's going on. Then you can start thinking where and how to modify it. Also, go find a copy of Knuth volume 3, because a whole lot of the comments assume you've read Knuth's discussion of external sorting. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
Peter Eisentraut wrote: Am Montag, 26. November 2007 schrieb Mark Cave-Ayland: I'm working on a piece of code for PostGIS to allow the loading of projection configuration files from the share/postgresql directory, but The share directory is the wrong place for configuration files anyway. And moreover, non-PostgreSQL packages have no business putting files into PostgreSQL's private directories. How does that work with the stuff that goes into directories relative to $libdir (per documented recommendations), without creating easy-to-break paths like $libdir/../../MyAddon/8.3/MyAddon.dll? Regards, Dave ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] proposal, plpgsql, 8.4, for record in cursor
On 26/11/2007, Tom Lane [EMAIL PROTECTED] wrote: Pavel Stehule [EMAIL PROTECTED] writes: I propose new kind of FOR statement .. iteration over cursor, This seems useless and probably syntactically ambiguous. I don't see any syntactically problem and I have working prototype. This case is simply identified by cursor variable after IN keyword. It's possibility little bit clean open, fors, dynafors code. Regards Pavel Stehule regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 26, 2007 5:58 PM, Simon Riggs [EMAIL PROTECTED] wrote: I've written up my suggestion as a 5 line patch, with measured performance improvement of about 40% for the varchar case. It isn't a great long term fix, but I don't want to delay 8.3 either with the full fix. Can we see the patch? Thanks. -- Guillaume ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Fw: [HACKERS] Replacement Selection
Thanks for your advice. The developement of this integration is part of my final project. And fo course my initial bibliografy includes the Knuth reference as you can see 1. Vladimir Estivill-Castro and Derick Wood. A survey of adaptive sorting algorithms. ACM Computing Surveys, 24(4):441{476, 1992. 2. Donald E. Knuth. The art of computer programming, volume 3: sorting and searching. Addison-Wesley, Reading, 2nd edition, 1998. 3. P. Larson and G. Graefe. Memory management during run generation in external sorting. In ACM, editor, SIGMOD98, pages 472{483, 1998. 4. Per-Ake Larson. External sorting: Run formation revisited. IEEE Transactions on Knowledge and Data Engineering, 15(4):961{972, 2003. 5. Je®rey Scott Vitter and David A. Hutchinson. Distribution sort with randomized cycling. pages 77-86. -- From: Tom Lane [EMAIL PROTECTED] Sent: Monday, November 26, 2007 6:00 PM To: Alvaro Herrera [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection Alvaro Herrera [EMAIL PROTECTED] writes: [EMAIL PROTECTED] wrote: Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? I think you should print the file and read it several times until you understand what's going on. Then you can start thinking where and how to modify it. Also, go find a copy of Knuth volume 3, because a whole lot of the comments assume you've read Knuth's discussion of external sorting. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] maintenance_work_mem memory constraint?
While supporting a customer to increase recovery performance from its backups i just realized that PostgreSQL never uses big maintenance_work_mem settings. Even giving 10GB of RAM to maintenance_work_mem results in using a fraction of memory (it switches to external sort after using around 2 GB). I think the culprit ist the following code in tuplesort.c, grow_memtuples(), as the comments there let assume already: /* * On a 64-bit machine, allowedMem could be high enough to get us into * trouble with MaxAllocSize, too. */ if ((Size) (state-memtupsize * 2) = MaxAllocSize / sizeof(SortTuple)) return false; While i understand, that doubling the memtuples array is more efficient than increasing the array in smaller steps, i think we give away usable memory, because we never consider using memory up to the upper limit given by MaxAllocSize. Modifying the code in that way results in a sightly better memory usage, but far away from what the system is able to use on such a machine (see the diff attached, a very crude experimental code). I've played around with increasing the MaxAllocSize as well and got the backend to use up to 6GB maintenance_work_mem during creating an index with 80.000.000 integer tuples. That way the backend was able to sort the tuples entirely in memory, speeding up the creation of the index from 200s to 80s. I understand that we have to handle MaxAllocSize very carefully, since it's involved in many cases in the code. But isn't it worth to special case the code in grow_memtuples() (and maybe other places where sort is likely to use more RAM), so that we can remove this constraint on 64-Bit systems with many RAM built in? Or am I missing something very important?. -- Thanks Bernd*** tuplesort.c 2007-11-26 18:30:21.0 +0100 --- tuplesort.c.new 2007-11-26 18:30:11.0 +0100 *** *** 810,820 * On a 64-bit machine, allowedMem could be high enough to get us into * trouble with MaxAllocSize, too. */ if ((Size) (state-memtupsize * 2) = MaxAllocSize / sizeof(SortTuple)) ! return false; FREEMEM(state, GetMemoryChunkSpace(state-memtuples)); - state-memtupsize *= 2; state-memtuples = (SortTuple *) repalloc(state-memtuples, state-memtupsize * sizeof(SortTuple)); --- 810,837 * On a 64-bit machine, allowedMem could be high enough to get us into * trouble with MaxAllocSize, too. */ + + if ((Size) (state-memtupsize) + 1 = MaxAllocSize / sizeof(SortTuple)) + { + elog(DEBUG1, memtupsize %d reached MaxAllocSize %lu, + state-memtupsize, MaxAllocSize); + /* max alloc size already reached */ + return false; + } + if ((Size) (state-memtupsize * 2) = MaxAllocSize / sizeof(SortTuple)) ! { ! /* use the maximum allowed alloc size */ ! state-memtupsize = (MaxAllocSize / sizeof(SortTuple) - 1); ! elog(DEBUG1, memtupsize %d exceeds MaxAllocSize %lu, readjusted, ! state-memtupsize * 2, MaxAllocSize); ! } ! else ! { ! state-memtupsize *= 2; ! } FREEMEM(state, GetMemoryChunkSpace(state-memtuples)); state-memtuples = (SortTuple *) repalloc(state-memtuples, state-memtupsize * sizeof(SortTuple)); ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Simon Riggs [EMAIL PROTECTED] writes: I've written up my suggestion as a 5 line patch, with measured performance improvement of about 40% for the varchar case. A 5-line patch which improves performance by 40% for any case sounds amazing, but how fragile is that gain? The kind of thing which would be worryign is if runing a query which uses both varchar and some other ambiguous operator causes it to lose all its gain. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Gregory Stark [EMAIL PROTECTED] writes: A 5-line patch which improves performance by 40% for any case sounds amazing, but how fragile is that gain? The kind of thing which would be worryign is if runing a query which uses both varchar and some other ambiguous operator causes it to lose all its gain. Yeah, exactly. If we're going to risk anything like this at all, the cache-of-one restriction is simply not acceptable (especially given that the part of the coding it would eliminate is the simplest and easiest-to-get-right part). In the test case Guillame provided, every single WHERE clause happens to be of the form varchar_column = 'unknown-type literal' and there are no other operators used in the SELECT lists; but I can hardly believe that this is representative of any significant number of real-world applications. Even pgbench uses more than one operator. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] maintenance_work_mem memory constraint?
Bernd Helmle [EMAIL PROTECTED] writes: ... But isn't it worth to special case the code in grow_memtuples() (and maybe other places where sort is likely to use more RAM), so that we can remove this constraint on 64-Bit systems with many RAM built in? Or am I missing something very important?. AFAICS this patch can increase the number of sortable tuples by at most 2X (less one). That doesn't seem worth getting very worked up about ... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Replacement Selection
Sorry. I'm trying to integrate my code into PostgreSQL. At the moment I have got my working code, with my own main() etc etc. The code is supposed to perform run generation during external sorting. That's all, my code won't do any mergesort. Just run generation. I'm studing the code and I don't know where to put my code into. Which part I need to substitute and which other are absolutely untouchables. I admit I'm not an excellent programmer. I've always been writing my own codes, simple codes. Now I have got some ideas that can possibly help postgreSQL to get better. And for the first time I'm to integrate code into others code. I say it just to apologize in case some things that could be obvious for someone else, maybe are not for me. Anyway... back to work. My code has the following structure. 1) Generates a random input stream to sort. As for this part, i just generate an integer input stream, not a stream of db records. I talk about stream because I'm in a general case in which the input source can be unknown and we cannot even know how much elements to sort 2)Fill the available memory with the first M elements from stream. They will be arranged into an heap structure. 3) Start run generation. As for this phase, I see PostgreSQL code (as Knuth algorithm) marks elements belonging to runs in otder to know which run they belong to and to know when the current heap has finished building the current run. I don't memorize this kind of info. I just output from heap to run all of the elements going into the current run. The elements supposed to go into the next run (I call them dead records) are still stored into main memory, but as leaves of the heap. This implies reducing the heap size and so heapifying a smaller number of elements each time I get a dead record (it's not necessary to sort dead records). When the heap size is zero a new run is created heapifying all the dead records currently present into main memory. I haven't seen something similar into tuplesort.c, apparently no heapify is called no new run created and stuff like this. Do you see any parallelism between PostgreSQL code with what I said in the previous points? Thanks for your attention. -- From: Heikki Linnakangas [EMAIL PROTECTED] Sent: Monday, November 26, 2007 5:42 PM To: [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection [EMAIL PROTECTED] wrote: Unfortunately I'm lost into the code... any good soul helping me to understand what should be the precise part to be modified? You haven't given any details on what you're trying to do. What are you trying to do? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] proposal, plpgsql, 8.4, for record in cursor
[EMAIL PROTECTED] (Peter Eisentraut) writes: Am Montag, 26. November 2007 schrieb Tom Lane: Pavel Stehule [EMAIL PROTECTED] writes: I propose new kind of FOR statement .. iteration over cursor, This seems useless and probably syntactically ambiguous. I think that is isomorphic to what he mentioned as better conformance with PL/SQL. Hmm. So better conformance with PL/SQL amounts to being useless and syntactically ambiguous... ;-) -- let name=cbbrowne and tld=linuxfinances.info in name ^ @ ^ tld;; http://linuxfinances.info/info/sgml.html HE'S AHAB, boy; and Ahab of old, thou knowest, was a crowned king! -- /Moby-Dick/, Ch 16 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] writes: 3) Start run generation. As for this phase, I see PostgreSQL code (as Knuth algorithm) marks elements belonging to runs in otder to know which run they belong to and to know when the current heap has finished building the current run. I don't memorize this kind of info. I just output from heap to run all of the elements going into the current run. The elements supposed to go into the next run (I call them dead records) are still stored into main memory, but as leaves of the heap. This implies reducing the heap size and so heapifying a smaller number of elements each time I get a dead record (it's not necessary to sort dead records). When the heap size is zero a new run is created heapifying all the dead records currently present into main memory. Why would this be an improvement over Knuth? AFAICS you can't generate longer runs this way, and it's not saving any time --- in fact it's costing time, because re-heapifying adds a lot of new comparisons. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Replacement Selection
I must precise that it's not the improvement. Other more complex algorithms correspond to the refinements, but at the moment I just want to know which part of PostgreSQL code does what. I also implemented Replacement Selection (RS) so if I'm able to integrate my RS I hope I would be able to integrate the others too. Anyway, even in my RS implementation a longer run is created. The first M initialization elements will surely form part of the current run. M is the memory size so at least a run sized M will be created. After initialization, the elements are not suddenly output, but an element from heap is output into run as soon as I get an element from stream. In other words, for each element from stream, the root element of the heap is output, and the input element takes the root place into the heap. If that element is a good record I just heapify (since the element will be placed at the now free root place). If that input element is a dead record I swap it with the last leaf and reduce the heap size. -- From: Tom Lane [EMAIL PROTECTED] Sent: Monday, November 26, 2007 7:31 PM To: [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Replacement Selection [EMAIL PROTECTED] writes: 3) Start run generation. As for this phase, I see PostgreSQL code (as Knuth algorithm) marks elements belonging to runs in otder to know which run they belong to and to know when the current heap has finished building the current run. I don't memorize this kind of info. I just output from heap to run all of the elements going into the current run. The elements supposed to go into the next run (I call them dead records) are still stored into main memory, but as leaves of the heap. This implies reducing the heap size and so heapifying a smaller number of elements each time I get a dead record (it's not necessary to sort dead records). When the heap size is zero a new run is created heapifying all the dead records currently present into main memory. Why would this be an improvement over Knuth? AFAICS you can't generate longer runs this way, and it's not saving any time --- in fact it's costing time, because re-heapifying adds a lot of new comparisons. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 26, 2007 6:48 PM, Tom Lane [EMAIL PROTECTED] wrote: In the test case Guillame provided, every single WHERE clause happens to be of the form varchar_column = 'unknown-type literal' and there are no other operators used in the SELECT lists; but I can hardly believe that this is representative of any significant number of real-world applications. Even pgbench uses more than one operator. Sure. The application uses a lot of other operators (timestamp comparison, cube operators, LIKE, boolean comparison...). But as the primary keys are all varchar in this application, the operator varchar = unknown is by far the most used. If we can't find a solution for upstream 8.3, I'll be happy to apply a local patch for this customer (if I'm sure it doesn't do any harm in their case). Usually, people use integer flavoured primary keys so they probably won't be hit by this problem as strong as we are. -- Guillaume ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] wrote: I also implemented Replacement Selection (RS) so if I'm able to integrate my RS I hope I would be able to integrate the others too. The existing code implements RS. Tom asked you to describe what improvements you hope to make; I'm confident that he already understands how to implement RS. :-) ** Why don't you compile with TRACE_SORT enabled and watch the log output. The function in tuplesort.c that you should start with is puttuple_common(). in puttuple_common(), the transition from an internal to external sort is performed at the bottom of the TSS_INITIAL case in the main switch statement. The function dumptuples() heapifies the in-core tuples (divides the in-core tuples into initial runs and then advances the state to TSS_BUILDRUNS). All subsequent tuples will hit the TSS_BUILDRUNS case and will insert tuples into the heap; emitting tuples for the current run as it goes. I recommend you run the code in the debugger on a external-sorting query: watch two or three tuples go into the heap and you'll get the idea. The top of the heap is at state-memtuples[0] the heap goes down from there. New tuples are added there and the heap is adjusted (Using the tuplesort_heap_siftup() function). -Tim ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Mon, Nov 26, 2007 at 1:04 PM, in message [EMAIL PROTECTED], Guillaume Smet [EMAIL PROTECTED] wrote: On Nov 26, 2007 6:48 PM, Tom Lane [EMAIL PROTECTED] wrote: In the test case Guillame provided, every single WHERE clause happens to be of the form varchar_column = 'unknown-type literal' and there are no other operators used in the SELECT lists; but I can hardly believe that this is representative of any significant number of real-world applications. Even pgbench uses more than one operator. Sure. The application uses a lot of other operators (timestamp comparison, cube operators, LIKE, boolean comparison...). But as the primary keys are all varchar in this application, the operator varchar = unknown is by far the most used. If we can't find a solution for upstream 8.3, I'll be happy to apply a local patch for this customer (if I'm sure it doesn't do any harm in their case). Usually, people use integer flavoured primary keys so they probably won't be hit by this problem as strong as we are. I've been watching this with some concern, as we only use synthetic keys when there is no naturally occurring set of columns which uniquely identify a row. In our circuit court software (where we have the most users), there are 377 permanent tables, mostly with compound keys: keycolcount | count -+--- 1 |91 2 | 129 3 |88 4 |46 5 |20 6 | 3 (6 rows) We have a lot of varchar columns within those keys, although almost all are actually declared via a domain: basetype | count --+--- date |34 int | 146 numeric | 8 smallint | 365 varchar | 362 (5 rows) Complex queries usually join on at least one varchar column and one smallint column. Selection criteria usually involve comparing to a character string literal, sometimes a date literal, and almost always a smallint. Are we likely to see the 3% or the 7% performance degradation with version 8.3? Is the small patch likely to help with this usage pattern? I'm still trying to get 8.3 performance testing onto our schedule before the end (fast approaching, I know) of the beta. Is there anything in particular that would be helpful for me to look at regarding this particular issue in those tests? -Kevin ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
Dave Page wrote: How does that work with the stuff that goes into directories relative to $libdir (per documented recommendations), without creating easy-to-break paths like $libdir/../../MyAddon/8.3/MyAddon.dll? Which documented recommendation do you speak of? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: Am Montag, 26. November 2007 schrieb Mark Cave-Ayland: I'm working on a piece of code for PostGIS to allow the loading of projection configuration files from the share/postgresql directory, but The share directory is the wrong place for configuration files anyway. And moreover, non-PostgreSQL packages have no business putting files into PostgreSQL's private directories. That doesn't seem logical to me. We expect tsearch configuration files to be put into $sharedir/tsearch_data, so why shouldn't PostGIS use a similar approach to dealing with external configuration data? Well, PostGIS is not PostgreSQL. And those files are not really configuration files, in the sense that the user configures them, but data files (or else they are in the wrong place). -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] maintenance_work_mem memory constraint?
--On Montag, November 26, 2007 13:02:14 -0500 Tom Lane [EMAIL PROTECTED] wrote: Bernd Helmle [EMAIL PROTECTED] writes: ... But isn't it worth to special case the code in grow_memtuples() (and maybe other places where sort is likely to use more RAM), so that we can remove this constraint on 64-Bit systems with many RAM built in? Or am I missing something very important?. AFAICS this patch can increase the number of sortable tuples by at most 2X (less one). That doesn't seem worth getting very worked up about ... regards, tom lane That's true. Well, i haven't meant the diff as a discussable patch at all. It's just what i've done to understand why we have this limit for tuplesort. afaics, the main constraint here is MaxAllocSize, and i just wonder if that doesn't introduce unnecessary limits on systems which can use many RAM for index creation and wether we can be more generous here. So one idea could be to allow larger allocation requests during sorting on systems where we know that this is likely to work. -- Thanks Bernd ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
Quoting Tom, from the previous thread linked by Martijn: It could be pretty ugly, because type assignment normally proceeds bottom-up :-(. What you might have to do is make the raw grammar representation of ARRAY[] work like A_Const does, ie, there's a slot to plug in a typecast. That's pretty much vestigial now for A_Const, if memory serves, but it'd be needful if ARRAY[] has to be able to see the typecast that would otherwise be above it in the parse tree. This approach is making sense to me, but I've run into a bit of a dependency issue. A_Const does indeed have a slot for typecasts by way of a TypeName member. A_Const and TypeName are both defined in parsenodes.h, whereas ArrayExpr is defined in primnodes.h. So unfortunately I can't just add a TypeName member to ArrayExpr. I'm new to this area of the codebase (and parsers generally), so I'm treading carefully. What would be the best way to resolve this? Would moving TypeName into primnodes.h be acceptable? Thanks for your time, BJ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
--- Original Message --- From: Peter Eisentraut [EMAIL PROTECTED] To: Dave Page [EMAIL PROTECTED] Sent: 26/11/07, 20:14:25 Subject: Re: [HACKERS] Locating sharedir in PostgreSQL on Windows Dave Page wrote: How does that work with the stuff that goes into directories relative to $libdir (per documented recommendations), without creating easy-to-break paths like $libdir/../../MyAddon/8.3/MyAddon.dll? Which documented recommendation do you speak of? http://www.postgresql.org/docs/8.3/static/xfunc-c.html states: == It is recommended to locate shared libraries either relative to $libdir or through the dynamic library path. This simplifies version upgrades if the new installation is at a different location. The actual directory that $libdir stands for can be found out with the command pg_config --pkglibdir. == That seems to have been there since at least 7.2. /D ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
Brendan Jurd [EMAIL PROTECTED] writes: This approach is making sense to me, but I've run into a bit of a dependency issue. A_Const does indeed have a slot for typecasts by way of a TypeName member. A_Const and TypeName are both defined in parsenodes.h, whereas ArrayExpr is defined in primnodes.h. So unfortunately I can't just add a TypeName member to ArrayExpr. That would be quite the wrong thing to do anyway, since ArrayExpr is a run-time representation and shouldn't have any such thing attached to it. What you probably need is a separate parse-time representation of ARRAY[], a la the difference between A_Const and Const. Another possibility is to just hack up a private communication path between transformExpr and transformArrayExpr, ie when you see TypeCast check to see if its argument is ArrayExpr and do something different. This would be a mite klugy but it'd be a much smaller patch that way. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
Dave Page [EMAIL PROTECTED] writes: From: Peter Eisentraut [EMAIL PROTECTED] Which documented recommendation do you speak of? http://www.postgresql.org/docs/8.3/static/xfunc-c.html states: It is recommended to locate shared libraries either relative to $libdir or through the dynamic library path. This simplifies version upgrades if the new installation is at a different location. The actual directory that $libdir stands for can be found out with the command pg_config --pkglibdir. I believe that that is talking specifically about shared libraries (or DLLs in Windows-speak), and not about configuration or data files. In particular, nothing under libdir would be a candidate to go under sharedir, nor vice versa, since the former is supposed to hold architecture-dependent files and the latter architecture-independent files. Mark hasn't been very clear about whether he wants to store static data files or installation-changeable configuration info, so it's not clear to me whether Peter's objection to using sharedir is appropriate or not. But unless the files are architecture-sensitive (which they might be!), libdir doesn't sound right in either case. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Kevin Grittner [EMAIL PROTECTED] writes: Are we likely to see the 3% or the 7% performance degradation with version 8.3? Probably not, since it sounds like your queries are typically not as trivial as the ones in Guillame's test case. IOW there will be some slowdown but it's likely to be in the noise for you, and also very likely made up by improvements elsewhere. The test case is in the unfortunate position of not being helped materially by *any* of the work we've done for 8.3. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Replacement Selection
[EMAIL PROTECTED] writes: Anyway, even in my RS implementation a longer run is created. The first M initialization elements will surely form part of the current run. M is the memory size so at least a run sized M will be created. After initialization, the elements are not suddenly output, but an element from heap is output into run as soon as I get an element from stream. In other words, for each element from stream, the root element of the heap is output, and the input element takes the root place into the heap. If that element is a good record I just heapify (since the element will be placed at the now free root place). If that input element is a dead record I swap it with the last leaf and reduce the heap size. AFAICS that produces runs that are *exactly* the same length as Knuth's method --- you're just using a different technique for detecting when the run is over, to wit record is not in heap vs record is in heap but with a higher run number. I guess you would save some comparisons while the heap is shrinking, but it's not at all clear that you'd save more than what it will cost you to re-heapify all the dead records once the run is over. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
On Mon, 2007-11-26 at 11:55 -0500, Tom Lane wrote: I'm not sure why Mark's having a problem accessing my_exec_path --- it *is* declared DLLIMPORT in miscadmin.h (which is where it counts, AIUI) clear back to 8.0. Bah, I think that is the source of the problem. Having grepped the source for my_exec_path, I found the reference in globals.c and hence the code worked on Linux (my main development environment) with a simple extern declaration which was what was confusing me. Adding the #include miscadmin.h solves this problem and I can now access the variable on Windows aswell, so I put this down to user error - thanks for the advice Tom. Many thanks, Mark. -- ILande - Open Source Consultancy http://www.ilande.co.uk ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
--- Original Message --- From: Tom Lane [EMAIL PROTECTED] To: Dave Page [EMAIL PROTECTED] Sent: 26/11/07, 22:02:09 Subject: Re: [HACKERS] Locating sharedir in PostgreSQL on Windows I believe that that is talking specifically about shared libraries (or DLLs in Windows-speak), and not about configuration or data files. In particular, nothing under libdir would be a candidate to go under sharedir, nor vice versa, since the former is supposed to hold architecture-dependent files and the latter architecture-independent files. Yes, I know. Peter seemed to be saying that nothing except postgres itself should be in *any* of the installation directories (he called them 'private' directories by which I imagine he meant an install in /usr/local/pgsql/ vs /usr/local, or more specifically the case when $libdir is something like /usr/lib/postgresql) /D ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
Dave Page [EMAIL PROTECTED] writes: Yes, I know. Peter seemed to be saying that nothing except postgres itself should be in *any* of the installation directories Yeah, that's what I think he said too, but it strikes me as a completely bogus policy --- what about contrib modules or stuff from pgfoundry or any random user-written module that was built with PGXS? All that stuff happily drops files under $libdir and $sharedir, and I see no good argument why it shouldn't. There is fair room for argument about whether $sharedir is the appropriate place for installation-specific config files though. Those should probably live in the same place as postgresql.conf and friends, ie, typically but not necessarily $PGDATA. It doesn't look like SelectConfigFiles() saves the config directory name anyplace where an add-on module could get at it. We could perhaps fix that for 8.3, but it won't help Mark if he insists on a backwards-compatible solution. I suppose he could copy ConfigFileName and strip the last component from it? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
On Nov 27, 2007 8:04 AM, Tom Lane [EMAIL PROTECTED] wrote: Brendan Jurd [EMAIL PROTECTED] writes: ... So unfortunately I can't just add a TypeName member to ArrayExpr. That would be quite the wrong thing to do anyway, since ArrayExpr is a run-time representation and shouldn't have any such thing attached to it. What you probably need is a separate parse-time representation of ARRAY[], a la the difference between A_Const and Const. Ah. I wasn't aware of the distinction; I started by looking in gram.y and saw that the ARRAY parse path creates an ArrayExpr node, whilst the constant parse paths create A_Const nodes. I didn't realise that ArrayExpr was skipping ahead and creating the same kind of object that the transform produces. Glad I stopped and asked for directions then. =) I'm not 100% clear on what the A_ prefix signifies ... is A_ArrayExpr a good name for the parse-time structure? Thanks for your time, BJ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
Brendan Jurd [EMAIL PROTECTED] writes: I'm not 100% clear on what the A_ prefix signifies ... is A_ArrayExpr a good name for the parse-time structure? Yeah, might as well use that for consistency. The A_ doesn't seem very meaningful to me either, but I don't want to rename the existing examples ... regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Replacement Selection
Tom Lane [EMAIL PROTECTED] writes: AFAICS that produces runs that are *exactly* the same length as Knuth's method --- you're just using a different technique for detecting when the run is over, to wit record is not in heap vs record is in heap but with a higher run number. I guess you would save some comparisons while the heap is shrinking, but it's not at all clear that you'd save more than what it will cost you to re-heapify all the dead records once the run is over. This sounded familiar... It sounds a lot like what this CVS log message is describing as a mistaken idea: revision 1.2 date: 1999-10-30 18:27:15 +0100; author: tgl; state: Exp; lines: +423 -191; Further performance improvements in sorting: reduce number of comparisons during initial run formation by keeping both current run and next-run tuples in the same heap (yup, Knuth is smarter than I am). And, during merge passes, make use of available sort memory to load multiple tuples from any one input 'tape' at a time, thereby improving locality of access to the temp file. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Table inheritance, unique constraints and foreign key problem
Hello, this issue has been requested and its on the TODO-list. Since I really need foreign key constraints on inherited tables, I have two solutions: Adding some hackish RULES/TRIGGERS to my tables or implementing it myself. It think the latter is better. However, I have no experience in implementing such a feature in Postgres. I have written some triggers in C (actually C++) using the SPI_-functions, but that's it. I have a running 8.3beta, checked out from the repository, and I have read the Ottawa slides. Can someone tell me in a few lines, where to start with such a feature. Jacob ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
On Mon, 2007-11-26 at 17:02 -0500, Tom Lane wrote: I believe that that is talking specifically about shared libraries (or DLLs in Windows-speak), and not about configuration or data files. In particular, nothing under libdir would be a candidate to go under sharedir, nor vice versa, since the former is supposed to hold architecture-dependent files and the latter architecture-independent files. Mark hasn't been very clear about whether he wants to store static data files or installation-changeable configuration info, so it's not clear to me whether Peter's objection to using sharedir is appropriate or not. But unless the files are architecture-sensitive (which they might be!), libdir doesn't sound right in either case. Okay, I'll try and expand on this a bit. In order to convert coordinates between different coordinate systems, PostGIS uses the external PROJ.4 library. Now in order to support a certain category of conversion, PROJ.4 requires access to a set of library grid reference files which are effectively compiled from source files into a set of data files as part of the build process. The path to this directory of files is then built into the DLL at compile time, although it can be overriden with an API call. Under Linux, this is fairly easy as the files are normally installed somewhere under /usr/share/proj, and hence the directory exists at both compile-time and run-time. Windows is trickier because drive letters and mappings can change - the default of C:\PROJ\NAD may or may not exist, or can change depending upon the current drive configuration. I can also see issues arising if the PostgreSQL installation is moved from the C:\ drive to another. Hence my idea was to create a directory under $sharedir such as $sharedir/postgresql/contrib/postgis/nad and install the files there. Then regardless of the location of the PostgreSQL installation or the current drive setup, I can use get_share_path() with the PROJ.4 API to set the new library path the first time the function is called, and everything will just work. I can see Peter's argument about not putting files directly in $sharedir, but I feel the usage suggested above falls under a similar use case to the tsearch2 data files (which is mostly where I looked for inspiration). Hopefully this will help make things a bit clearer - please let me know if any more information is needed. Many thanks, Mark. -- ILande - Open Source Consultancy http://www.ilande.co.uk ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
--- Original Message --- From: Tom Lane [EMAIL PROTECTED] To: Dave Page [EMAIL PROTECTED] Sent: 26/11/07, 22:30:17 Subject: Re: [HACKERS] Locating sharedir in PostgreSQL on Windows Dave Page [EMAIL PROTECTED] writes: Yes, I know. Peter seemed to be saying that nothing except postgres itself should be in *any* of the installation directories Yeah, that's what I think he said too, but it strikes me as a completely bogus policy --- what about contrib modules or stuff from pgfoundry or any random user-written module that was built with PGXS? All that stuff happily drops files under $libdir and $sharedir, and I see no good argument why it shouldn't. Yup, exactly. There is fair room for argument about whether $sharedir is the appropriate place for installation-specific config files though. Those should probably live in the same place as postgresql.conf and friends, ie, typically but not necessarily $PGDATA. It doesn't look like SelectConfigFiles() saves the config directory name anyplace where an add-on module could get at it. We could perhaps fix that for 8.3, but it won't help Mark if he insists on a backwards-compatible solution. I suppose he could copy ConfigFileName and strip the last component from it? Wouldn't be difficult, but I suspect Mark is probably using the path for the PostGIS SQL scripts lwgeom.sql, spatial_ref_sys.sql etc. rather than config files. But, I see he's found his bug now anyway. /D ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Replacement Selection
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: I guess you would save some comparisons while the heap is shrinking, but it's not at all clear that you'd save more than what it will cost you to re-heapify all the dead records once the run is over. This sounded familiar... It sounds a lot like what this CVS log message is describing as a mistaken idea: Wow, I had forgotten all about that; but yeah this sounds exactly like my first-cut rewrite of PG's sorting back in 1999. I have some vague memory of having dismissed Knuth's approach as being silly because of the extra space and (small number of) cycles needed to compare run numbers in the heap. I hadn't realized that there was an impact on total number of comparisons required :-( The discussion from that time period in pgsql-hackers makes it sound like you need a large test case to notice the problem, though. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
Mark Cave-Ayland [EMAIL PROTECTED] writes: Okay, I'll try and expand on this a bit. In order to convert coordinates between different coordinate systems, PostGIS uses the external PROJ.4 library. Now in order to support a certain category of conversion, PROJ.4 requires access to a set of library grid reference files which are effectively compiled from source files into a set of data files as part of the build process. The path to this directory of files is then built into the DLL at compile time, although it can be overriden with an API call. Under Linux, this is fairly easy as the files are normally installed somewhere under /usr/share/proj, and hence the directory exists at both compile-time and run-time. OK, if the files are normally under /usr/share then it's presumably kosher to put them under our $sharedir. I guess the only question is whether you are worried about having two copies in a machine where PROJ.4 is also installed natively. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Mon, 2007-11-26 at 09:55 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: But I think there must be an action that we can take for 8.3 and that much runtime should not be given away easily. ISTM that we can win back the losses Guillaume has identified, plus gain a little more even. Perhaps some sanity could be restored to this discussion by pointing out that the 2007-01-01 code *also* clocks in at 37% spent in oper_select_candidate. IOW it's been like this for a very long time. [I'm replying to the wrong message, I know.] Here's where I am: Basic test was to replace call to oper_select_candidate with a single item that was fed by a hardcoded value for varchar equality operator. This is the oper_cache.v1.patch enclosed; the 5 line patch. Test results were - w/o patch ~10,500 tps with pgbench_varchar.sql - with patch~15,500 tps with pgbench_varchar.sql (**big gain**) - w/o patch ~16,250 tps with pgbench_integer.sql - with patch~16,250 tps with pgbench_integer.sql Tables are standard pgbench, varchar test table created using: create table av as select aid::varchar, bid, abalance, filler from accounts; create unique index av_pkey on av (aid); The impact of calling oper_select_candidate() is big enough that it will affect any query that is read only and has 1 or 2 predicates when at least one of them is a VARCHAR_col = const query. What I'm actually proposing is a patch implementing a oper_select_hook function pointer, which allows the user to do anything they want. I'm just re-writing that as a plugin now, but the backend patch is included here for discussion. oper_select_hook.v1.patch -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com Index: src/backend/parser/parse_oper.c === RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/parser/parse_oper.c,v retrieving revision 1.98 diff -c -r1.98 parse_oper.c *** src/backend/parser/parse_oper.c 22 Nov 2007 19:40:25 - 1.98 --- src/backend/parser/parse_oper.c 26 Nov 2007 16:28:42 - *** *** 28,33 --- 28,34 #include utils/syscache.h #include utils/typcache.h + static Oid operOidCache = InvalidOid; static Oid binary_oper_exact(List *opname, Oid arg1, Oid arg2); static FuncDetailCode oper_select_candidate(int nargs, *** *** 516,534 /* No operators found? Then fail... */ if (clist != NULL) { ! /* ! * Unspecified type for one of the arguments? then use the other ! * (XXX this is probably dead code?) ! */ ! Oid inputOids[2]; ! if (rtypeId == InvalidOid) ! rtypeId = ltypeId; ! else if (ltypeId == InvalidOid) ! ltypeId = rtypeId; ! inputOids[0] = ltypeId; ! inputOids[1] = rtypeId; ! fdresult = oper_select_candidate(2, inputOids, clist, operOid); } } --- 517,543 /* No operators found? Then fail... */ if (clist != NULL) { ! if (OidIsValid(operOidCache)) ! operOid = operOidCache; ! else ! { ! /* ! * Unspecified type for one of the arguments? then use the other ! * (XXX this is probably dead code?) ! */ ! Oid inputOids[2]; ! ! if (rtypeId == InvalidOid) ! rtypeId = ltypeId; ! else if (ltypeId == InvalidOid) ! ltypeId = rtypeId; ! inputOids[0] = ltypeId; ! inputOids[1] = rtypeId; ! fdresult = oper_select_candidate(2, inputOids, clist, operOid); ! if (ltypeId == 1043 rtypeId == 705) ! operOidCache = operOid; ! } } } \set naccounts 10 * :scale \setrandom aid 1 :naccounts SELECT abalance FROM accounts WHERE aid = :aid; \set naccounts 10 * :scale \setrandom aid 1 :naccounts SELECT abalance FROM av WHERE aid = ':aid'; Index: src/backend/parser/parse_oper.c === RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/parser/parse_oper.c,v retrieving revision 1.98 diff -c -r1.98 parse_oper.c *** src/backend/parser/parse_oper.c 22 Nov 2007 19:40:25 - 1.98 --- src/backend/parser/parse_oper.c 26 Nov 2007 23:08:09 - *** *** 28,33 --- 28,34 #include utils/syscache.h #include utils/typcache.h + oper_select_hook_type oper_select_hook = NULL; static Oid binary_oper_exact(List *opname, Oid arg1, Oid arg2); static FuncDetailCode oper_select_candidate(int nargs, *** *** 528,534 ltypeId = rtypeId; inputOids[0] = ltypeId; inputOids[1] = rtypeId; ! fdresult = oper_select_candidate(2, inputOids, clist, operOid); } } --- 529,544 ltypeId = rtypeId; inputOids[0] = ltypeId; inputOids[1] = rtypeId; ! ! /* ! * Allow user defined operator selection, allowing hard-coding ! * cacheing or other mechanisms of operator selection for ! * improved performance in certain circumstances ! */ ! if (oper_select_hook) ! fdresult = (* oper_select_hook) (2,
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Simon Riggs [EMAIL PROTECTED] writes: Here's where I am: Basic test was to replace call to oper_select_candidate with a single item that was fed by a hardcoded value for varchar equality operator. Well, that confirms what we knew from gprof, but surely you aren't proposing that as a usable patch. What I'm actually proposing is a patch implementing a oper_select_hook function pointer, which allows the user to do anything they want. Why in the world would that be a good idea? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale
Bruce Momjian wrote: D'Arcy J.M. Cain wrote: On Sat, 24 Nov 2007 11:27:38 -0500 (EST) Bruce Momjian [EMAIL PROTECTED] wrote: I am confused about two other items with MONEY. First, why can't anything but a string be cast to this type? test= select 871234872319489323::money; ERROR: cannot cast type bigint to money LINE 1: select 871234872319489323::money; ^ test= select 871234872::money; ERROR: cannot cast type integer to money LINE 1: select 871234872::money; ^ test= select 87123487231.3::money; ERROR: cannot cast type numeric to money LINE 1: select 87123487231.3::money; ^ I agree. I wasn't the one that added the meta information. OK, so the big question then is if we are un-depricating this data type, does it have the behavior we want? (And the regression addition will be helpful too.) Added to TODO list: * Allow MONEY to be cast to/from other numeric data types -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Mon, 2007-11-26 at 18:18 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Here's where I am: Basic test was to replace call to oper_select_candidate with a single item that was fed by a hardcoded value for varchar equality operator. Well, that confirms what we knew from gprof, but surely you aren't proposing that as a usable patch. gprof might not have translated into a usable gain, but clearly it can. That's not a proposed patch, just showing my results. What I'm actually proposing is a patch implementing a oper_select_hook function pointer, which allows the user to do anything they want. Why in the world would that be a good idea? Short answer: it makes it go faster? You asked. ;-) Long answer: We all agree the operator cache is the best answer, yet don't wish to delay the project or make it less robust. The best answer is a plugin approach that lets users take the risk and make the gain. We can't hardcode it for everybody because that runs completely against the grain of Postgres. Including this as a plugin allows people to make their own decisions about cacheing/hardcoding. If you are the unlucky owner of a database with a heavy read workload and lots of VARCHAR keys then you're going to want this. The plugin allows writing a one-slot cache that is never flushed. If you choose to override the operators then you'd need to reconnect. It also allows some performance tuning in other cases too, so having it as a general case makes sense. The overhead of implementing it this way is very close to zero and the code path doesn't even get called in the integers-as-keys cases. I don't really like all of this, but that much gain is too much for me to ignore. Better ideas eagerly accepted, and encouraged. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: But I think there must be an action that we can take for 8.3 and that much runtime should not be given away easily. ISTM that we can win back the losses Guillaume has identified, plus gain a little more even. Perhaps some sanity could be restored to this discussion by pointing out that the 2007-01-01 code *also* clocks in at 37% spent in oper_select_candidate. IOW it's been like this for a very long time. I'm not interested in destabilizing 8.3 with panicky last-minute patches. So how about we have a cache-of-one: Cache-of-one has exactly the same difficulty as cache-of-many, other than the table lookup itself, which is a solved problem (hashtable). You still have to determine how you identify the cached value and what events require a cache flush. Nor do I see any particular reason to assume that a cache of only one operator would be of any use for real-world apps, as opposed to toy examples. Seems like anytime a function like that takes 37%, there is something wrong. Are we sure there isn't a bug in there somewhere? As far as a cache, could we create a simple cache that remembered the last X lookups and cleared the cache anytime a cache invalidation message came in? -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Table inheritance, unique constraints and foreign key problem
Jacob Rief [EMAIL PROTECTED] writes: this issue has been requested and its on the TODO-list. Since I really need foreign key constraints on inherited tables, I have two solutions: Adding some hackish RULES/TRIGGERS to my tables or implementing it myself. It think the latter is better. However, I have no experience in implementing such a feature in Postgres. I have written some triggers in C (actually C++) using the SPI_-functions, but that's it. I have a running 8.3beta, checked out from the repository, and I have read the Ottawa slides. Can someone tell me in a few lines, where to start with such a feature. This is the tip of an iceberg. As you dig you find out it's caused by deeper and deeper limitations until you're pretty much all of the executor. The RI trigger code explicitly uses ONLY for the integrity checks. But if you remove that you find it breaks because you get this message: ERROR: SELECT FOR UPDATE/SHARE is not supported for inheritance queries Look at src/backend/optimizer/path/allpaths.c:287 for a comment about this. I'm a bit puzzled myself why this affects SELECT FOR UPDATE/SHARE but not straight UPDATES and DELETES. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale
Bruce Momjian [EMAIL PROTECTED] writes: Added to TODO list: * Allow MONEY to be cast to/from other numeric data types So in other words, that's been added to the TODO list *purely* on your own say-so, and not because any users asked for it or anyone else thinks it's a good idea. Since MONEY really ought to be considered a tagged type, I'm not at all impressed with the idea that there should be default casts between it and plain numerics. There's a fundamental semantic gap there. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Table inheritance, unique constraints and foreign key problem
Gregory Stark [EMAIL PROTECTED] writes: I'm a bit puzzled myself why this affects SELECT FOR UPDATE/SHARE but not straight UPDATES and DELETES. In straight UPDATE/DELETE we have enough structure in the query to know how to associate each tuple returned to the executor top level with exactly one tuple in exactly one target table (which is where to apply the tuple lock operation). We don't have that much structure in general SELECT --- for example, what to do with null-filled rows in a LEFT JOIN, or cases where one row gives rise to more than one joined row, or aggregation or UNION? Some of these cases can probably be rejected as unsupportable, but it'll still take a lot of work. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Mon, 26 Nov 2007 21:19:48 -0500 Tom Lane [EMAIL PROTECTED] wrote: Bruce Momjian [EMAIL PROTECTED] writes: Added to TODO list: * Allow MONEY to be cast to/from other numeric data types So in other words, that's been added to the TODO list *purely* on your own say-so, and not because any users asked for it or anyone else thinks it's a good idea. Well if we are going to continue to support money (which I am against) we should support the casting to numeric as that is by far a more common implementation of money and we will have mixed environments. Since MONEY really ought to be considered a tagged type, I'm not at all impressed with the idea that there should be default casts between it and plain numerics. There's a fundamental semantic gap there. Perhaps but the practical gap is much narrower. Sincerely, Joshua D. Drake regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFHS4QhATb/zqfZUUQRAraiAJ9HLeqG7nRbblPvJhu/JQwhOrmzxQCgpWGE 1JShnZ4xwM1+lQzTKCkGVcw= =sErU -END PGP SIGNATURE- ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale
Joshua D. Drake [EMAIL PROTECTED] writes: Well if we are going to continue to support money (which I am against) we should support the casting to numeric as that is by far a more common implementation of money and we will have mixed environments. So, you don't use MONEY, and you don't want to, but nonetheless you know better than the people who do use MONEY what they need. Aside from the semantic-gap issue, there is the point that providing a cast might actually mask application errors. I can well imagine cases where one of the reasons for using MONEY is *exactly* that it's not a plain number or easily convertible to one. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] Proposed patch for operator lookup caching
Tom Lane wrote: Since Simon seems intent on hacking something in there, here is a patch that I think is actually sane for improving operator lookup speed. This patch caches all lookups, exact or ambiguous (since even the exact ones require multiple cache searches in common cases); and behaves sanely in the presence of search_path, pg_operator, or pg_cast changes. I see about a 45% speedup (2110 vs 1445 tps) on Guillame Smet's test case. On straight pgbench --- which has no ambiguous operators, plus it's not read-only --- it's hard to measure any consistent speedup, but I can say that it's not slower. Some other test cases would be nice. I went through the code that's being bypassed in some detail, to see what dependencies were being skipped over. I think that as long as we assume that no *existing* type changes its domain base type, typtype, array status, type category, or preferred-type status, we don't need to flush the cache on pg_type changes. This is a good thing since pg_type changes frequently (eg, at temp table create or drop). The only case that I believe to be unhandled is that the cache doesn't pay attention to ALTER TABLE ... INHERIT / NO INHERIT events. This means it is theoretically possible to return the wrong operator if an operator takes a complex type as input and the calling situation involves another complex type whose inheritance relationship to that one changes. That's sufficiently far out of the normal case that I'm not very worried about it (in fact, we probably have bugs in that area even without this patch, since for instance cached plans don't respond to such changes either). We could plug the hole by forcing a system-wide cache reset during ALTER TABLE ... INHERIT / NO INHERIT, if anyone insists. I'm not entirely happy about applying a patch like this so late in the beta cycle, but I'd much rather do this than than any of the less-than-half-baked ideas that have been floated in the discussion so far. Thanks for the patch. I can see it is clearly of significant size. I also noted that you found that the case of: SELECT col FROM tab WHERE text_col = 'ABC'; also took 37% of CPU in January, I think meaning we had this problem in 8.2. On the one hand we have a pretty significant patch that we might apply. It gives us a major speedup (+30%) for a common query type. I assume 8.3 was slightly slower than 8.2 only because we have a few more pg_catalog entries in 8.3 than 8.2. (I am still baffled how a lookup function could take so much CPU compared to what else is done for a query.) We are also talking about catlog changes for 8.3. Are we comfortable doing catalog changes between the beta and RC? I am wondering if the right plan is to have someone else review your patch, apply it, make the catalog changes, and release another beta this weekend. Give the beta one week of testing and go for RC. That gives us testing of the patch, and testing of the catalog changes before going to RC1. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] Proposed patch for operator lookup caching
Bruce Momjian [EMAIL PROTECTED] writes: We are also talking about catlog changes for 8.3. Are we comfortable doing catalog changes between the beta and RC? The catalog changes in question seem entirely safe ... certainly much more so than this patch ... I do see your point that another beta might be prudent, but on the other hand I'm not sure it's really needed. The only difference between a beta and an RC is that we try not to change the code anymore after RC. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale
Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Well if we are going to continue to support money (which I am against) we should support the casting to numeric as that is by far a more common implementation of money and we will have mixed environments. So, you don't use MONEY, and you don't want to, but nonetheless you know better than the people who do use MONEY what they need. I found out you can cast numerics to MONEY by using two casts: test= SELECT 12321.12::text::money; money $12,321.12 (1 row) For some reason this doesn't work in 8.2 but does in 8.3. The reverse doesn't work: test= SELECT '12321.12'::money::text::numeric; ERROR: invalid input syntax for type numeric: $12,321.12 The big problem is that MONEY is a string so the dollar sign and commas are a problem. Aside from the semantic-gap issue, there is the point that providing a cast might actually mask application errors. I can well imagine cases where one of the reasons for using MONEY is *exactly* that it's not a plain number or easily convertible to one. Perhaps all we need is a way to accomplish the casting so it isn't automatic. This works: test= SELECT regexp_replace('2343'::money::text, '[^$,]*', '', 'g')::numeric; regexp_replace 2343.00 (1 row) but the '$' and ',' are locale-specific and I can't think of a way to do this generically. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Mon, 26 Nov 2007 23:47:04 -0500 (EST) Bruce Momjian [EMAIL PROTECTED] wrote: Perhaps all we need is a way to accomplish the casting so it isn't automatic. This works: test= SELECT regexp_replace('2343'::money::text, '[^$,]*', '', 'g')::numeric; regexp_replace 2343.00 (1 row) but the '$' and ',' are locale-specific and I can't think of a way to do this generically. With a regex? Joshua D. Drake - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFHS6RjATb/zqfZUUQRAtZyAJ9VZeCzjX+RSGr3A9eWXTcbHwnc4gCgoZcH LcO9KnjcycRo5YjCektSJXg= =HZpl -END PGP SIGNATURE- ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale
Joshua D. Drake wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Mon, 26 Nov 2007 23:47:04 -0500 (EST) Bruce Momjian [EMAIL PROTECTED] wrote: Perhaps all we need is a way to accomplish the casting so it isn't automatic. This works: test= SELECT regexp_replace('2343'::money::text, '[^$,]*', '', 'g')::numeric; regexp_replace 2343.00 (1 row) but the '$' and ',' are locale-specific and I can't think of a way to do this generically. With a regex? The problem is there is no locale-independent way to determine if '123.456' is ~123k or ~123. I think we are going to need a way to output the MONEY value without a currency and thousands symbols. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Proposed patch for operator lookup caching
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: We are also talking about catlog changes for 8.3. Are we comfortable doing catalog changes between the beta and RC? The catalog changes in question seem entirely safe ... certainly much more so than this patch ... I do see your point that another beta might be prudent, but on the other hand I'm not sure it's really needed. The only difference between a beta and an RC is that we try not to change the code anymore after RC. To me RC means we think this might be the release candidate and I would like to get some testing in of this in beta before hitting that point. And an additional beta might encourage more testing too. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Proposed patch for operator lookup caching
Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: We are also talking about catlog changes for 8.3. Are we comfortable doing catalog changes between the beta and RC? The catalog changes in question seem entirely safe ... certainly much more so than this patch ... I do see your point that another beta might be prudent, but on the other hand I'm not sure it's really needed. The only difference between a beta and an RC is that we try not to change the code anymore after RC. To me RC means we think this might be the release candidate and I would like to get some testing in of this in beta before hitting that point. And an additional beta might encourage more testing too. I agree with Bruce here. If you want to apply that operator lookup cache patch, I would have another beta. (And I am not personally against it, because I feel major performance fixes may sometimes slip in as bug fixes.) If you all decide against that patch, we might as well just go for RC1. The catalog changes seem rather trivial, and just a required initdb is no reason for calling it another beta, IMHO. Great work on that patch, btw.! Best Regards Michael Paesold ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Locating sharedir in PostgreSQL on Windows
Tom Lane wrote: Yeah, that's what I think he said too, but it strikes me as a completely bogus policy --- what about contrib modules or stuff from pgfoundry or any random user-written module that was built with PGXS? All that stuff happily drops files under $libdir and $sharedir, and I see no good argument why it shouldn't. We have an implicit policy for those two specific cases, which are OK with me. I guess it's better to look at specific use cases instead of interpreting other people's abstract directory descriptions. ;-) -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Fixes for MONEY type using locale
Tom Lane wrote: Aside from the semantic-gap issue, there is the point that providing a cast might actually mask application errors. I can well imagine cases where one of the reasons for using MONEY is *exactly* that it's not a plain number or easily convertible to one. I'm always against casts, but I can hardly object to an explicit cast between money and numeric. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] developing for psqlodbc
i'd like to try to develop some improvements to psqlodbc. is there source for a test suite available? could anyone point me towards it, or offer testing source code they've written personally? thanks, john ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster