[HACKERS] help wanted (fmgr.c)
I am a little stuck on a question. In fmgr.c:1698, function InputFunctionCall, at line 1718 has: /* Should get null result if and only if str is NULL */ if (str == NULL) { What are we testing to be NULL here? Do we expect str to change at line 1715 ( result = FunctionCallInvoke(&fcinfo); )? Please advise, Gevik. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [Pgsqlrpms-hackers] [HACKERS] Safer auto-initdb for RPM init
On Sat, 9 Sep 2006 at 15:57, Lamar Owen wrote: [...] or annoying the small number of people who NFS mount their datadirs? This problem is not limited to NFS. It can happen with any FS just by reversing (for whatever reason) the order of mounting the FS and starting the PostgreSQL server. But the SUSE RPMs have had the auto-initdb feature since the PostgreSQL 7.0 days (almost six years), and AFAIR I never got a single complaint about this sort of problem. cu Reinhard
Re: [HACKERS] help wanted (fmgr.c)
At 2006-09-11 10:25:22 +0200, [EMAIL PROTECTED] wrote: What are we testing to be NULL here? Do we expect str to change at line 1715 No. (Read the comment just above the function.) The code is like this, starting from line 1703:
if (str == NULL && flinfo->fn_strict)
    return (Datum) 0;
That is, if the function is declared strict and the argument (str) is NULL, just return NULL straightaway. Then it sets up the fcinfo and calls the function, and then:
...
/* Should get null result if and only if str is NULL */
if (str == NULL)
{
    if (!fcinfo.isnull)
        elog(ERROR, "input function %u returned non-NULL",
             fcinfo.flinfo->fn_oid);
}
else
{
    if (fcinfo.isnull)
        elog(ERROR, "input function %u returned NULL",
             fcinfo.flinfo->fn_oid);
}
This says: if the argument is NULL and the input function didn't return a NULL, log an error; but if the argument is non-NULL and the function returned NULL, log this other error. (Note that a function would set fcinfo->isnull to indicate that it wants to return an SQL NULL, as explained in $DOC/plhandler.html) -- ams
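The rule described above (a NULL input string must yield a NULL result, and a non-NULL input must not) can be sketched as a tiny standalone predicate; the helper name here is made up for illustration and is not the actual fmgr.c code:

```c
#include <stdbool.h>

/* Hypothetical sketch of the consistency rule InputFunctionCall
 * enforces: the result may be NULL exactly when the input string
 * was NULL.  Returns true when the pairing is legal. */
static bool input_result_consistent(bool str_is_null, bool result_is_null)
{
    return str_is_null == result_is_null;
}
```

In InputFunctionCall itself, an illegal pairing is reported with elog(ERROR, ...) rather than returned as a flag.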
[HACKERS] Cassowary failing to report the results back to the farm
Hello, I'm trying to put the cassowary buildfarm member back to work (it's been inactive for almost a month because I've moved to another project and switched the machine). The run_build script has trouble with sending the test results. The error is: Status Line: 491 bad ts parameter - 1157995969 (Mon Sep 11 17:32:49 2006 GMT) is in the future. Content: bad ts parameter - 1157995969 (Mon Sep 11 17:32:49 2006 GMT) is in the future Web txn failed with status: 1 It's not clear to me where that date-in-the-future comes from. The machine's date is set correctly: $ date Mon Sep 11 11:00:30 PST 2006 Any ideas about what might cause this? -- Adrian Maier
Re: [HACKERS] Cassowary failing to report the results back to the
On Mon, 11 Sep 2006, Adrian Maier wrote: It's not clear to me where that date-in-the-future comes from. The machine's date is set correctly: $ date Mon Sep 11 11:00:30 PST 2006 Um, no. I am currently in the PST time zone, and I can say from first-hand experience that the current time is 2:21 am, not 11 am. I have confirmed this by looking out the window and noticing a distinct lack of light. The time you have quoted is about 8.5 hours in the future. Suggest you either verify your time zone, or look out your window ;) Any ideas about what might cause this? -- If you can lead it to water and force it to drink, it isn't a horse.
Re: [HACKERS] Cassowary failing to report the results back to the farm
On 9/11/06, Jeremy Drake [EMAIL PROTECTED] wrote: On Mon, 11 Sep 2006, Adrian Maier wrote: It's not clear to me where that date-in-the-future comes from. The machine's date is set correctly: $ date Mon Sep 11 11:00:30 PST 2006 Um, no. I am currently in the PST time zone, and I can say from first-hand experience that the current time is 2:21 am, not 11 am. I have confirmed this by looking out the window and noticing a distinct lack of light. The time you have quoted is about 8.5 hours in the future. Suggest you either verify your time zone, or look out your window ;) Thanks Jeremy, you are right. I hadn't realised that 'PST' indicates a timezone (the wrong one..). I'll have to change the timezone to GMT+2. Sorry for the noise! -- Adrian Maier
Re: [HACKERS] Fixed length data types issue
Tom Lane [EMAIL PROTECTED] writes: Gregory Stark [EMAIL PROTECTED] writes: I'm a bit confused by this and how it would be handled in your sketch. I assumed we needed a bit pattern dedicated to 4-byte length headers because even though it would never occur on disk it would be necessary for the uncompressed and/or detoasted data. In your scheme what would PG_GETARG_TEXT() give you if the data was detoasted to larger than 16k? I'm imagining that it would give you the same old uncompressed in-memory representation as it does now, ie, 4-byte length word and uncompressed data. Sure, but how would you know? Sometimes you would get a pointer to a varlena starting with a byte with a leading 00 indicating a 1-byte varlena header, and sometimes you would get a pointer to a varlena with the old uncompressed representation with a 4-byte length header which may well start with a 00. * If high order bit of first byte is 1, then it's some compressed variant. I'd propose divvying up the code space like this:
* 0xxx uncompressed 4-byte length word as stated above
* 10xx 1-byte length word, up to 62 bytes of data
* 110x 2-byte length word, uncompressed inline data
* 1110 2-byte length word, compressed inline data
* 1111 1-byte length word, out-of-line TOAST pointer
I'm unclear how you're using the remaining bits. Are you saying you would have a 4-byte length word following this bit-flag byte? Or are you saying we would use 31 bits for the 4-byte length word, 13 bits for the 2-byte uncompressed length word and 12 bits for the compressed length word? Also Heikki points out here that it would be nice to allow for the case of a 0-byte header. So for example if the leading bit is 0 then the remaining 7 bits are available for the datum itself. This would actually vacate much of my argument for a fixed length char(n) data type. The most frequent use case is for things like CHAR(1) fields containing 'Y' or 'N'. In any case it seems a bit backwards to me.
Wouldn't it be better to preserve bits in the case of short length words where they're precious rather than long ones? If we make 0xxx the 1-byte case it means limiting our maximum datum size to something like .5G but if you're working with .5G data wouldn't you be using an api that lets you access it by chunks anyways? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
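A minimal sketch of classifying a varlena's first byte under the code space proposed upthread (the tag names and exact bit tests are this sketch's assumptions, not committed PostgreSQL code):

```c
/* Classify a varlena's first byte under the proposed code space:
 * 0xxx / 10xx / 110x / 1110 / 1111.  Names are made up for this
 * sketch; a real patch might divide the bits differently. */
enum vl_tag {
    VL_4BYTE_UNCOMPRESSED,  /* 0xxxxxxx: 4-byte length word */
    VL_1BYTE_SHORT,         /* 10xxxxxx: 1-byte header, <= 62 bytes */
    VL_2BYTE_INLINE,        /* 110xxxxx: 2-byte header, uncompressed */
    VL_2BYTE_COMPRESSED,    /* 1110xxxx: 2-byte header, compressed */
    VL_TOAST_POINTER        /* 1111xxxx: out-of-line TOAST pointer */
};

static enum vl_tag vl_classify(unsigned char b)
{
    if ((b & 0x80) == 0x00) return VL_4BYTE_UNCOMPRESSED;
    if ((b & 0xC0) == 0x80) return VL_1BYTE_SHORT;
    if ((b & 0xE0) == 0xC0) return VL_2BYTE_INLINE;
    if ((b & 0xF0) == 0xE0) return VL_2BYTE_COMPRESSED;
    return VL_TOAST_POINTER;
}
```

This also shows why a 4-byte length word starting with a 0 bit leaves only 31 bits of length, i.e. the ~.5G maximum mentioned above.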
Re: [HACKERS] Fixed length data types issue
Hi, Tom, Tom Lane wrote: The only way we could pack stuff without alignment is to go over to the idea that memory and disk representations are different --- where in this case the conversion might just be a memcpy to a known-aligned location. The performance costs of that seem pretty daunting, however, especially when you reflect that simply stepping over a varlena field would require memcpy'ing its length word to someplace. AFAICS, PostGIS already uses this approach internally, mostly because its current format requires a mix of byte-sized and larger (int, double) fields. Markus -- Markus Schaber | Logical Tracking & Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
[HACKERS] Optimizer improvements: to do or not to do?
I intend to play with some optimizer aspects. Just for fun. I'm a novice in DBMS development so I cannot promise any usable results, but if it can be useful even as yet another failed attempt, I will try. That's what I want to do: 1. Replace the not-very-useful indexCorrelation with indexClustering. 2. Consider caching of the inner table in a nested loop join when estimating the total cost of the join. More details: 1. During analyze we have sample rows. For every N-th sample row we can scan indices on a qual like 'value = index_first_column' and fetch the first N row TIDs. Estimating the count of fetched heap pages is not hard. To get the index clustering value, just divide the page count by the sample row count. 2. This is much harder and may be impossible for me altogether. The main ideas: - split page-fetch cost and CPU cost into different variables and don't sum them before join estimation. - final path cost estimation should be done in the join cost estimation and take into account the number of inner table accesses (=K). CPU cost is directly proportional to K, but page fetches can be estimated by the Mackert and Lohman formula using the total tuple count (K * inner_table_selectivity * inner_table_total_tuples). Any thoughts?
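For reference, a sketch of the Mackert and Lohman page-fetch approximation as summarized in PostgreSQL's costsize.c comments (T = table pages, N = tuples fetched, b = cache size in pages); the exact branch conditions here are this sketch's reading of those comments, not a drop-in planner function:

```c
#include <math.h>

/* Mackert & Lohman estimate of distinct pages fetched when N tuples
 * are fetched from a T-page table through a cache of b pages.
 * Illustrative only; the real planner code handles more edge cases. */
static double ml_pages_fetched(double T, double N, double b)
{
    if (T <= b)
    {
        /* table fits in cache: asymptote at T */
        double pf = (2.0 * T * N) / (2.0 * T + N);
        return pf < T ? pf : T;
    }
    else
    {
        /* table larger than cache: linear growth past the limit */
        double lim = (2.0 * T * b) / (2.0 * T - b);
        if (N <= lim)
            return (2.0 * T * N) / (2.0 * T + N);
        else
            return b + (N - lim) * (T - b) / T;
    }
}
```

For a nested loop the N plugged in would be the total inner-side fetches, K * inner_table_selectivity * inner_table_total_tuples, as proposed above.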
Re: [HACKERS] Fixed length data types issue
Tom Lane [EMAIL PROTECTED] writes: Mark Dilger [EMAIL PROTECTED] writes: ... The argument made upthread that a quadratic number of conversion operators is necessitated doesn't seem right to me, given that each type could upcast to the canonical built in type. (int1 = smallint, int3 = integer, ascii1 = text, ascii2 = text, ascii3 = text, etc.) This would work all right for the string-category cases, since TEXT is the only thing you really care about having them cast to anyway. It probably won't work all that well for int1/int3, because you really want them to coerce implicitly to all the wider numeric types. Otherwise, perfectly sane queries like int8 + int1 fail. Actually that one works: you get numeric addition, which then would get cast back to an int8 if you tried to store it in an int8 column. Not necessarily the most efficient way to go about it though. However you do have to provide all the cross-data-type comparisons if you want indexes to work right, and that alone gives you a couple hundred catalog entries. Part of the issue here is that we deliberately keep the parser from searching for multi-step coercions. So for example if you only provide int1->int2 then the existence of up-casts from int2 doesn't help you use an int1 with anything except int2. After my initial plea for multi-step coercions I've thought about it a bit further and I think I can make a stronger case for them now: Consider that in the current situation there's an asymmetry between function calls and casts. If you call a function and there's an implicit cast that matches the argument then we'll use it. But if you use a cast implemented with that same function it won't work. For example, say you implement a numeric data type called mynumber, define a function mynumber(numeric) returning mynumber, and use it to implement the cast numeric::mynumber. You'll find mynumber(integer) works just fine but integer::mynumber doesn't.
To make the situation parallel we would have to find two-step casts only for explicit casts, and if and only if the intermediate cast is an implicit cast. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Cassowary failing to report the results back to the
Adrian Maier wrote: On 9/11/06, Jeremy Drake [EMAIL PROTECTED] wrote: On Mon, 11 Sep 2006, Adrian Maier wrote: It's not clear to me where that date-in-the-future comes from. The machine's date is set correctly: $ date Mon Sep 11 11:00:30 PST 2006 [...] Suggest you either verify your time zone, or look out your window ;) Thanks Jeremy, you are right. I hadn't realised that 'PST' indicates a timezone (the wrong one..). I'll have to change the timezone to GMT+2. Sorry for the noise! FYI, the buildfarm scripts all ignore timezone completely. All time calculations are done in terms of UTC (a.k.a. GMT for these purposes). The result you saw was from some sanity checks built into the web app. All machines that report buildfarm results should run an NTP client or some equivalent means of keeping the system clock reasonably accurate. NTP clients are available for every platform we support (including Windows). If you want to see the UTC time setting, give the command: date -u cheers andrew
[HACKERS] Emacs local vars at the tail of every file
I don't know if this changes the calculus but apparently we've already decided to go down the route of having Emacs local variables attached to every file in the source directory. We have things like this there:
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"../reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:"/usr/lib/sgml/catalog"
sgml-local-ecat-files:nil
End:
-->
-- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [HACKERS] Fix linking of OpenLDAP libraries
Albe Laurenz [EMAIL PROTECTED] writes: Tom Lane wrote: Applied, but without that last part. It builds OK for me on Darwin, which is moderately picky about that sort of thing, but someone should try AIX. It builds fine on AIX 5.3 as long as you tell it to link with libpq.so. Static builds against libpq.a will fail. Hm. We have been assuming that AIX's problem is that dynamic libraries don't remember their dependencies properly, but maybe the real issue is that it prefers static over dynamic libraries? If so, what we ought to be doing is adding the prefer-dynamic-libraries switch to the default LDFLAGS on that platform. Should -lldap or -lldap_r be added to pg_config --libs? You have a mistaken idea of the purpose of pg_config --libs. It exists to record what LIBS was at build time, not more, not less. It is certainly not intended as a guide to how to link libpq. regards, tom lane
[HACKERS] Buildfarm vs. Linux Distro classification
Lately there have been some buildfarm registrations for Debian testing/unstable or similarly described machines. I have kicked back against these, as the description seems to me to be far too open ended. Likewise, I also have difficulty with Gentoo because a version there seems to describe a build system but not any reasonably bounded set of software. The whole point of classifying buildfarm machines is so that we can pin down problems to some class of machines, so I'm not really sure what I should do in these cases that essentially represent highly mobile targets. Thoughts? cheers andrew
Re: [PATCHES] [HACKERS] Fix linking of OpenLDAP libraries
Martijn van Oosterhout kleptog@svana.org writes: Static links are going to require it on every platform, not just AIX. The question we have to ask is how easy we want to make static linking, because the same treatment will have to apply to -lssl, -lcrypto, -lkrb5, -lk5crypto and quite possibly others. Do we really want to go there? Well, we already have a solution for static linking within the PG build environment; the question is do we wish to export it. Given the lack of complaints to date, I think not. Do we really want to encourage anyone to statically link libraries that don't belong to their project? It's not only the build-time dependency hell, it's the difficulty of installing library updates. There's a reason why dynamic link is the default on all modern platforms. regards, tom lane
Re: [HACKERS] Emacs local vars at the tail of every file
Gregory Stark wrote: I don't know if this changes the calculus but apparently we've already decided to go down the route of having Emacs local variables attached to every file in the source directory. We have things like this there:
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"../reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:"/usr/lib/sgml/catalog"
sgml-local-ecat-files:nil
End:
-->
Only in the docs. cheers andrew
Re: [HACKERS] Proposal for GUID datatype
Hi, Gevik, Gevik Babakhani wrote: typreceive = not supported typsend = not supported Any reason why you don't want to support binary transmissions? Thanks, Markus -- Markus Schaber | Logical Tracking & Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
Re: [HACKERS] Emacs local vars at the tail of every file
Andrew Dunstan [EMAIL PROTECTED] writes: Gregory Stark wrote: I don't know if this changes the calculus but apparently we've already decided to go down the route of having Emacs local variables attached to every file in the source directory. We have things like this there: <!-- Keep this comment at the end of the file Local variables: Only in the docs. And indeed I'd argue for taking those *out* of the docs files, and adding them to the suggested emacs config-file hacking. A sufficient reason why is that I doubt these settings are uniformly valid across all emacs versions. We already know that the C-code indentation settings have changed, and this sgml configuration is far more detailed than those. regards, tom lane
Re: [HACKERS] Buildfarm vs. Linux Distro classification
Andrew Dunstan wrote: Lately there have been some buildfarm registrations for Debian testing/unstable or similarly described machines. I have kicked back against these, as the description seems to me to be far too open ended. Likewise, I also have difficulty with Gentoo because a version there seems to describe a build system but not any reasonably bounded set of software. The whole point of classifying buildfarm machines is so that we can pin down problems to some class of machines, so I'm not really sure what I should do in these cases that essentially represent highly mobile targets. Thoughts? Well, the one downside to kicking these machines is that we won't get representation from the varying glibc/gcc combinations. It *almost* seems that we care more about: Architecture, 32/64bit (I could be running 32bit on my Athlon64), GCC version, glibc version? Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
Re: [HACKERS] Fixed length data types issue
Tom Lane [EMAIL PROTECTED] writes: Also Heikki points out here that it would be nice to allow for the case for a 0-byte header. I don't think there's enough code space for that; at least not compared to its use case. Well it's irrelevant if we add a special data type to handle CHAR(1). But if we don't it's pretty important. Even with 1-byte varlena headers you can have approaching 100% bloat if you have a table with lots of CHAR(1) fields. That said I'm not sure whether it's worth it over having a special CHAR(1) data type which would have the benefit of handling other 1-byte encodings aside from ascii. We would probably still need a CHAR(2) data type too where the overhead is still 50%. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [HACKERS] Fix linking of OpenLDAP libraries
Tom Lane wrote: It builds fine on AIX 5.3 as long as you tell it to link with libpq.so. Static builds against libpq.a will fail. Hm. We have been assuming that AIX's problem is that dynamic libraries don't remember their dependencies properly, but maybe the real issue is that it prefers static over dynamic libraries? If so, what we ought to be doing is adding the prefer-dynamic-libraries switch to the default LDFLAGS on that platform. AIX shared libraries know their dependencies well, and you don't have to specify them again when linking against the library. Let me expand a little on some of the peculiarities of shared libraries on AIX:
- A normal AIX shared library is called libXX.a. It is an 'ar' archive that contains the shared object(s).
- In static linking mode, the shared object will be included in the executable. In dynamic linking mode (the default) only a reference to the shared object is included.
- AIX can also link against shared objects called libXX.so (like other UNIXen).
- When it resolves a -lXX flag, the linker searches the -L list for libXX.a and libXX.so. It prefers libXX.a over libXX.so unless invoked with the flag -brtl.
So the linker does not prefer static over dynamic libraries, it just prefers libXX.a over libXX.so. In our case, we have libpq.a and libpq.so in the same directory, so unless you link with -brtl you will get a static link (because libpq.a is a static library). To illustrate, let me include the output of 'ldd psql' for a dynamically linked PostgreSQL: 8.2/bin/psql needs: /postgres/8.2/lib/libpq.so /usr/lib/libpthread.a(shr_xpg5_64.o) /usr/lib/libc.a(shr_64.o) /usr/local/lib/libldap_r.a(libldap_r-2.3.so.0) /unix /usr/lib/libcrypt.a(shr_64.o) /usr/lib/libs.a(shr_64.o) See http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.cmds/doc/aixcmds3/ld.htm Should -brtl be added to src/template/aix? Should -lldap or -lldap_r be added to pg_config --libs?
You have a mistaken idea of the purpose of pg_config --libs. It exists to record what LIBS was at build time, not more, not less. It is certainly not intended as a guide to how to link libpq. *nods* Thanks for the clarification. Yours, Laurenz Albe
Re: [HACKERS] Fixed length data types issue
Gregory Stark wrote: Tom Lane [EMAIL PROTECTED] writes: Also Heikki points out here that it would be nice to allow for the case for a 0-byte header. I don't think there's enough code space for that; at least not compared to its use case. Well it's irrelevant if we add a special data type to handle CHAR(1). In that case you should probably be using "char" ... -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PATCHES] [HACKERS] Fix linking of OpenLDAP libraries
Albe Laurenz [EMAIL PROTECTED] writes: Let me expand a little on some of the peculiarities of shared libraries on AIX: - A normal AIX shared library is called libXX.a. It is an 'ar' archive that contains the shared object(s). Ah, so the problem really boils down to funny naming conventions. If they use .a for both shared and static libraries, how does anyone tell the difference? So the linker does not prefer static over dynamic libraries, it just prefers libXX.a over libXX.so. In our case, we have libpq.a and libpq.so in the same directory, so unless you link with -brtl you will get a static link (because libpq.a is a static library). I wonder whether we ought to suppress building (or at least installing) our .a libraries at all on AIX. Adding -brtl to LDFLAGS would help within the context of our own build, but external clients that link to libpq without saying that are going to get undesirable results. I think there's a reasonable argument that by installing a .a file that isn't a shared library, we are violating the platform's conventions. Should -brtl be added to src/template/aix? Sounds that way, but that'll only help for psql and other stuff built within our build. Could you try this against CVS tip: * add -brtl to LDFLAGS in the template * remove the AIX-specific hack on $(libpq) at lines 349-354 of src/Makefile.global.in * see if it configures and builds regards, tom lane
Re: [HACKERS] Fixed length data types issue
On Mon, Sep 11, 2006 at 03:13:36PM +0100, Gregory Stark wrote: Tom Lane [EMAIL PROTECTED] writes: Also Heikki points out here that it would be nice to allow for the case for a 0-byte header. I don't think there's enough code space for that; at least not compared to its use case. Well it's irrelevant if we add a special data type to handle CHAR(1). We already have a CHAR(1), it's called "char" and it's exactly one byte. This discussion should probably be about strings longer than that. It's a pity arrays have so much overhead, otherwise you could work with arrays of "char". Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
Re: [HACKERS] Proposal for GUID datatype
Jan de Visser wrote: On Friday 08 September 2006 15:18, Gevik Babakhani wrote: 2a) Three input formats are supported. example: insert into tbl (fld) values('1dfb39af-b56a-40b8-a903-b5b31567c3ce'); insert into tbl (fld) values('{1dfb39af-b56a-40b8-a903-b5b31567c3ce}'); insert into tbl (fld) values('1dfb39afb56a40b8a903b5b31567c3ce'); Please extend your list. java's RMI package generates UIDs with the following format: [head order 21:19]$ bsh.sh BeanShell 1.3.0 - by Pat Niemeyer ([EMAIL PROTECTED]) bsh % print(new java.rmi.server.UID()); 3b732da7:10d9029b3eb:-8000 bsh % So forms that use colons instead of dashes seem appropriate. Or better still, make it configurable. jan RMI UIDs have nothing in common with UUIDs, so I fail to see how they have any relevance here. I've never seen any representation of UUIDs other than the two that Gevik wants to support. Please note that UUID is a very well known concept and not an arbitrary 128 bit storage format. http://en.wikipedia.org/wiki/UUID is a good source of information. The referenced RFC actually contains source code. Kind Regards, Thomas Hallgren
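For concreteness, accepting exactly the input forms Gevik lists could look like this (a hypothetical validity check written for this discussion, not the actual patch code):

```c
#include <ctype.h>
#include <string.h>

/* Accept the three proposed input formats: 8-4-4-4-12 hex with dashes,
 * the same wrapped in braces, and 32 plain hex digits.  Returns 1 if
 * the string looks like a valid UUID, 0 otherwise.  Hypothetical
 * helper; real input functions would also need to parse the value. */
static int uuid_looks_valid(const char *s)
{
    size_t len = strlen(s);
    char buf[37];
    int i;

    /* strip surrounding braces, if any */
    if (len == 38 && s[0] == '{' && s[37] == '}')
    {
        memcpy(buf, s + 1, 36);
        buf[36] = '\0';
        s = buf;
        len = 36;
    }
    if (len == 36)
    {
        for (i = 0; i < 36; i++)
        {
            if (i == 8 || i == 13 || i == 18 || i == 23)
            {
                if (s[i] != '-')
                    return 0;
            }
            else if (!isxdigit((unsigned char) s[i]))
                return 0;
        }
        return 1;
    }
    if (len == 32)
    {
        for (i = 0; i < 32; i++)
            if (!isxdigit((unsigned char) s[i]))
                return 0;
        return 1;
    }
    return 0;
}
```

Note that the colon-delimited RMI UID form would be rejected, which matches Thomas's point that RMI UIDs are not UUIDs.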
Re: [HACKERS] Proposal for GUID datatype
On Monday 11 September 2006 11:05, Thomas Hallgren wrote: Jan de Visser wrote: On Friday 08 September 2006 15:18, Gevik Babakhani wrote: 2a) Three input formats are supported. example: insert into tbl (fld) values('1dfb39af-b56a-40b8-a903-b5b31567c3ce'); insert into tbl (fld) values('{1dfb39af-b56a-40b8-a903-b5b31567c3ce}'); insert into tbl (fld) values('1dfb39afb56a40b8a903b5b31567c3ce'); Please extend your list. java's RMI package generates UIDs with the following format: [head order 21:19]$ bsh.sh BeanShell 1.3.0 - by Pat Niemeyer ([EMAIL PROTECTED]) bsh % print(new java.rmi.server.UID()); 3b732da7:10d9029b3eb:-8000 bsh % So forms that use colons instead of dashes seem appropriate. Or better still, make it configurable. jan RMI UID's has nothing in common with UUID's so I fail to see how they have any relevance here. I've never seen any other representation of the UUID's that the two that Gevik wants to support. Please note that UUID is a very well known concept and not an arbitrary 128 bit storage. http://en.wikipedia.org/wiki/UUID is a good source of information. The appointed RFC actually contains source code. OK, good info, and something I was missing/didn't know. Apparently there is a JDK1.5 class that does the right thing. Kind Regards, Thomas Hallgren Crawling back under my rock, jan -- -- Jan de Visser [EMAIL PROTECTED] Baruk Khazad! Khazad ai-menu!
Re: [HACKERS] contrib/xml2 and PG_MODULE_MAGIC
Michael Fuhr [EMAIL PROTECTED] writes: contrib/xml2 currently has PG_MODULE_MAGIC in xslt_proc.c, which results in a runtime error on systems that built the module without support for libxslt per the comments in the Makefile. Should PG_MODULE_MAGIC be in xpath.c instead? [ examines xml2/Makefile ... ] Yeah, sure looks that way. Will fix. Even more interesting would be to fix things so that xml2 gets built as part of the regular contrib build, but I'm not sure if we're ready to add stuff to the configure script for the sole benefit of a contrib module. I'd be more willing to do it if there were a long-term plan for putting libxml-dependent code into the core. Do we have a roadmap yet for XML development? I thought there was a session at the anniversary conference devoted to sketching one, but if anything came out of that I missed it. regards, tom lane
Re: [HACKERS] Lock partitions
Mark Wong [EMAIL PROTECTED] writes: Tom Lane wrote: It would be nice to see some results from the OSDL tests with, say, 4, 8, and 16 lock partitions before we forget about the point though. Anybody know whether OSDL is in a position to run tests for us? Yeah, I can run some dbt2 tests in the lab. I'll get started on it. We're still a little bit away from getting the automated testing for PostgreSQL going again though. Great, thanks. The thing to twiddle is LOG2_NUM_LOCK_PARTITIONS in src/include/storage/lwlock.h. You need a full backend recompile after changing it, but you shouldn't need to initdb, if that helps. regards, tom lane
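The knob in question looks roughly like this in src/include/storage/lwlock.h (the value shown is one example setting for the suggested 4/8/16-partition tests, i.e. 2, 3, or 4, not necessarily the shipped default):

```c
/* Sketch of the lock-partition knob: the partition count is a power
 * of two, so only the log2 value is edited.  Setting it to 4 yields
 * 2^4 = 16 partitions. */
#define LOG2_NUM_LOCK_PARTITIONS  4
#define NUM_LOCK_PARTITIONS  (1 << LOG2_NUM_LOCK_PARTITIONS)
```

Because the value only affects in-memory lock tables, changing it requires a backend recompile but no initdb, as Tom notes.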
Re: [HACKERS] Proposal for GUID datatype
I forgot to mention that I did not want to support those two (typreceive and typsend) in my initial version. But yesterday I started to work on them anyway :) On Mon, 2006-09-11 at 15:58 +0200, Markus Schaber wrote: Hi, Gevik, Gevik Babakhani wrote: typreceive = not supported typsend = not supported Any reason why you don't want to support binary transmissions? Thanks, Markus
Re: [HACKERS] help wanted (fmgr.c)
Thank you for your reply. I found my bug in the code which made the function behave strangely.

On Mon, 2006-09-11 at 14:23 +0530, Abhijit Menon-Sen wrote:

  At 2006-09-11 10:25:22 +0200, [EMAIL PROTECTED] wrote:

    What are we testing to be NULL here? Do we expect str to be changed at line 1715

  No. (Read the comment just above the function.) The code is like this, starting from line 1703:

      if (str == NULL && flinfo->fn_strict)
          return (Datum) 0;

  That is, if the function is declared strict and the argument (str) is NULL, just return NULL straightaway. Then it sets up the fcinfo and calls the function, and then:

      ...
      /* Should get null result if and only if str is NULL */
      if (str == NULL)
      {
          if (!fcinfo.isnull)
              elog(ERROR, "input function %u returned non-NULL",
                   fcinfo.flinfo->fn_oid);
      }
      else
      {
          if (fcinfo.isnull)
              elog(ERROR, "input function %u returned NULL",
                   fcinfo.flinfo->fn_oid);
      }

  This says: If the argument is NULL and the input function didn't return a NULL, log an error; but if the argument is non-NULL and the function returned NULL, log this other error. (Note that a function would set fcinfo->isnull to indicate that it wants to return an SQL NULL, as explained in $DOC/plhandler.html)

  -- ams

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Fixed length data types issue
Alvaro Herrera [EMAIL PROTECTED] writes:

    Well it's irrelevant if we add a special data type to handle CHAR(1).

  In that case you should probably be using "char" ...

Well, "char" doesn't have quite the same semantics as CHAR(1). If that's the consensus though then I can work on either fixing "char" semantics to match CHAR(1) or adding a separate type instead.

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] contrib/xml2 and PG_MODULE_MAGIC
Tom Lane wrote: Michael Fuhr [EMAIL PROTECTED] writes: contrib/xml2 currently has PG_MODULE_MAGIC in xslt_proc.c, which results in a runtime error on systems that built the module without support for libxslt per the comments in the Makefile. Should PG_MODULE_MAGIC be in xpath.c instead? [ examines xml2/Makefile ... ] Yeah, sure looks that way. Will fix. Even more interesting would be to fix things so that xml2 gets built as part of the regular contrib build, but I'm not sure if we're ready to add stuff to the configure script for the sole benefit of a contrib module. I'd be more willing to do it if there were a long-term plan for putting libxml-dependent code into the core. Do we have a roadmap yet for XML development? I thought there was a session at the anniversary conference devoted to sketching one, but if anything came out of that I missed it. Me too. Part of the trouble is that support for the standard will require support in the parser, so we need to decide if we want to go down that road. Personally I think we should, but it would make a significant change from what we have available now. Peter seemed a bit more ambivalent about it, though. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Lock partitions
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I see this in the CVS commits for 8.2. Did we determine the proper number of lock partitions? Should it be based on the number of buffers or concurrent sessions allowed? No. NUM_LOCK_PARTITIONS needs to be a compile-time constant for a number of reasons, and there is absolutely zero evidence to justify making any effort (and spending any cycles) on a variable value. It would be nice to see some results from the OSDL tests with, say, 4, 8, and 16 lock partitions before we forget about the point though. Anybody know whether OSDL is in a position to run tests for us? Yeah, I can run some dbt2 tests in the lab. I'll get started on it. We're still a little bit away from getting the automated testing for PostgreSQL going again though. Mark ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Fixed length data types issue
On Sun, 2006-09-10 at 21:16 -0400, Tom Lane wrote:

  After further thought I have an alternate proposal (snip)

  * If high order bit of datum's first byte is 0, then it's an uncompressed datum in what's essentially the same as our current in-memory format except that the 4-byte length word must be big-endian (to ensure that the leading bit can be kept zero). In particular this format will be aligned on 4- or 8-byte boundary as called for by the datatype definition.

  * If high order bit of first byte is 1, then it's some compressed variant. I'd propose divvying up the code space like this:

      0xxx  uncompressed, 4-byte length word as stated above
      10xx  1-byte length word, up to 62 bytes of data
      110x  2-byte length word, uncompressed inline data
      1110  2-byte length word, compressed inline data
      1111  1-byte length word, out-of-line TOAST pointer

  This limits us to 8K uncompressed or 4K compressed inline data without toasting, which is slightly annoying but probably still an insignificant limitation. It also means more distinct cases for the heap_deform_tuple inner loop to think about, which might be a problem.

  Since the compressed forms would not be aligned to any boundary, there's an important special case here: how can heap_deform_tuple tell whether the next field is compressed or not? The answer is that we'll have to require pad bytes between fields to be zero. (They already are zeroed by heap_form_tuple, but now it'd be a requirement.) So the algorithm for decoding a non-null field is:

  * if looking at a byte with high bit 0, then we are either on the start of an uncompressed field, or on a pad byte before such a field. Advance to the declared alignment boundary for the datatype, read a 4-byte length word, and proceed.

  * if looking at a byte with high bit 1, then we are at the start of a compressed field (which will never have any preceding pad bytes). Decode length as per rules above.
The good thing about this approach is that it requires zero changes to fundamental system structure. The pack/unpack rules in heap_form_tuple and heap_deform_tuple change a bit, and the mechanics of PG_DETOAST_DATUM change, but a Datum is still just a pointer and you can always tell what you've got by examining the pointed-to data. Seems like a great approach to this pain point. More fun than lots of new datatypes also. Is this an 8.2 thing? If not, is Numeric508 applied? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Fixed length data types issue
Gregory Stark [EMAIL PROTECTED] writes:

  Tom Lane [EMAIL PROTECTED] writes:

    I'm imagining that it would give you the same old uncompressed in-memory representation as it does now, ie, 4-byte length word and uncompressed data.

  Sure, but how would you know? Sometimes you would get a pointer to a varlena starting with a byte with a leading 00 indicating a 1-byte varlena header and sometimes you would get a pointer to a varlena with the old uncompressed representation with a 4-byte length header which may well start with a 00.

Yeah, in that scheme you need some out-of-band information telling you if the datum is compressed or not. The second scheme I posted avoids that problem:

    * If high order bit of first byte is 1, then it's some compressed variant. I'd propose divvying up the code space like this:

        0xxx  uncompressed, 4-byte length word as stated above
        10xx  1-byte length word, up to 62 bytes of data
        110x  2-byte length word, uncompressed inline data
        1110  2-byte length word, compressed inline data
        1111  1-byte length word, out-of-line TOAST pointer

  I'm unclear how you're using the remaining bits.

Length (or high order bits of it, if the length covers more than 1 byte).

  Also Heikki points out here that it would be nice to allow for the case for a 0-byte header.

I don't think there's enough code space for that; at least not compared to its use case.

regards, tom lane

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Fixed length data types issue
Gregory Stark wrote:

  Alvaro Herrera [EMAIL PROTECTED] writes:

      Well it's irrelevant if we add a special data type to handle CHAR(1).

    In that case you should probably be using "char" ...

  Well, "char" doesn't have quite the same semantics as CHAR(1). If that's the consensus though then I can work on either fixing "char" semantics to match CHAR(1) or adding a separate type instead.

What semantics? I thought you would just store a byte there, retrieve it and compare to something else. Anything beyond this probably doesn't make much sense (to me anyway). Are you thinking of concatenating it, etc?

-- Alvaro Herrera  http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Buildfarm vs. Linux Distro classification
Andrew Dunstan wrote: Lately there have been some buildfarm registrations for Debian testing/unstable or similarly described machines. I have kicked back against these, as the description seems to me to be far too open ended. Then again, it would be useful to actually test on Debian testing/unstable (or pre-release branches of other OS), because that would (a) expose problems with new toolchains and such earlier than in released versions, and (b) provide advance testing for when testing becomes the release. Consider, the number of users that will run 8.2 on Debian stable is probably going to be less than the number of users who will run 8.2 on what today is testing. I agree that the lack of a fixed version designation is unsatisfactory. I'm not sure whether that is actually necessary, though. If PostgreSQL doesn't work on some machine, then that's a problem anyway. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Emacs local vars at the tail of every file
Gregory Stark wrote: I don't know if this changes the calculus but apparently we've already decided to go down the route of having Emacs local variables attached to every file in the source directory. We have things like this there: I delete them from every file I edit, but I haven't been interested enough to remove them all. Feel free to do the legwork. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Fixed length data types issue
Gregory Stark [EMAIL PROTECTED] writes:

  In any case it seems a bit backwards to me. Wouldn't it be better to preserve bits in the case of short length words where they're precious rather than long ones? If we make 0xxx the 1-byte case it means ...

Well, I don't find that real persuasive: you're saying that it's important to have a 1-byte not 2-byte header for datums between 64 and 127 bytes long. Which is by definition less than a 2% savings for those values. I think it's more important to pick bitpatterns that reduce the number of cases heap_deform_tuple has to think about while decoding the length of a field --- every "if" in that inner loop is expensive.

I realized this morning that if we are going to preserve the rule that 4-byte-header and compressed-header cases can be distinguished from the data alone, there is no reason to be very worried about whether the 2-byte cases can represent the maximal length of an in-line datum. If you want to do 16K inline (and your page is big enough for that) you can just fall back to the 4-byte-header case. So there's no real disadvantage if the 2-byte headers can only go up to 4K or so. This gives us some more flexibility in the bitpattern choices.

Another thought that occurred to me is that if we preserve the convention that a length word's value includes itself, then for a 1-byte header the bit pattern 10000000 is meaningless --- the count has to be at least 1. So one trick we could play is to take over this value as the signal for "toast pointer follows", with the assumption that the tuple-decoder code knows a-priori how big a toast pointer is. I am not real enamored of this, because it certainly adds one case to the inner heap_deform_tuple loop and it'll give us problems if we ever want more than one kind of toast pointer. But it's a possibility.
Anyway, a couple of encodings that I'm thinking about now involve limiting uncompressed data to 1G (same as now), so that we can play with the first 2 bits instead of just 1:

    00xx  4-byte length word, aligned, uncompressed data (up to 1G)
    01xx  4-byte length word, aligned, compressed data (up to 1G)
    100x  1-byte length word, unaligned, TOAST pointer
    1010  2-byte length word, unaligned, uncompressed data (up to 4K)
    1011  2-byte length word, unaligned, compressed data (up to 4K)
    11xx  1-byte length word, unaligned, uncompressed data (up to 63b)

or

    00xx  4-byte length word, aligned, uncompressed data (up to 1G)
    010x  2-byte length word, unaligned, uncompressed data (up to 8K)
    011x  2-byte length word, unaligned, compressed data (up to 8K)
    1000  1-byte length word, unaligned, TOAST pointer
    1xxx  1-byte length word, unaligned, uncompressed data (up to 127b)
          (xxx not all zero)

This second choice allows longer datums in both the 1-byte and 2-byte header formats, but it hardwires the length of a TOAST pointer and requires four cases to be distinguished in the inner loop; the first choice only requires three cases, because TOAST pointer and 1-byte header can be handled by the same rule "length is low 6 bits of byte". The second choice also loses the ability to store in-line compressed data above 8K, but that's probably an insignificant loss.

There's more than one way to do it ...

regards, tom lane

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?
Hi! I'm about to do some benchmarking on -HEAD on some hardware I have available and it seems I'm hitting a rather weird issue causing the osdl dbt3 benchmark to run very slow and eating CPU time for hours ... it seems that the issue is caused by the following query (in case it gets linewrapped: http://www.kaltenbrunner.cc/files/dbt3_with_hashjoin.txt):

select n_name,
       sum(l_extendedprice * (1 - l_discount)) as revenue
from customer, orders, lineitem, supplier, nation, region
where c_custkey = o_custkey
  and l_orderkey = o_orderkey
  and l_suppkey = s_suppkey
  and c_nationkey = s_nationkey
  and s_nationkey = n_nationkey
  and n_regionkey = r_regionkey
  and r_name = 'AFRICA'
  and o_orderdate >= date '1993-01-01'
  and o_orderdate < date '1993-01-01' + interval '1 year'
group by n_name
order by revenue desc;

that results in the following plan on my box:

 Sort  (cost=2543391.75..2543391.81 rows=25 width=37)
   Sort Key: sum((lineitem.l_extendedprice * (1::double precision - lineitem.l_discount)))
   ->  HashAggregate  (cost=2543390.73..2543391.17 rows=25 width=37)
         ->  Hash Join  (cost=440864.81..2543027.40 rows=72666 width=37)
               Hash Cond: ((orders.o_custkey = customer.c_custkey) AND (supplier.s_nationkey = customer.c_nationkey))
               ->  Hash Join  (cost=377714.59..2415568.01 rows=1816643 width=49)
                     Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                     ->  Nested Loop  (cost=13.65..1719683.85 rows=12000672 width=49)
                           ->  Merge Join  (cost=0.00..10248.66 rows=2 width=41)
                                 Merge Cond: (nation.n_nationkey = supplier.s_nationkey)
                                 ->  Nested Loop  (cost=0.00..19.19 rows=5 width=33)
                                       ->  Index Scan using pk_nation on nation  (cost=0.00..9.38 rows=25 width=37)
                                       ->  Index Scan using pk_region on region  (cost=0.00..0.38 rows=1 width=4)
                                             Index Cond: (nation.n_regionkey = region.r_regionkey)
                                             Filter: (r_name = 'AFRICA'::bpchar)
                                 ->  Index Scan using i_s_nationkey on supplier  (cost=0.00..9779.46 rows=10 width=8)
                           ->  Bitmap Heap Scan on lineitem  (cost=13.65..77.16 rows=665 width=16)
                                 Recheck Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                                 ->  Bitmap Index Scan on i_l_suppkey  (cost=0.00..13.65 rows=665 width=0)
                                       Index Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                     ->  Hash  (cost=372023.51..372023.51 rows=2270971 width=8)
                           ->  Bitmap Heap Scan on orders  (cost=41391.94..372023.51 rows=2270971 width=8)
                                 Recheck Cond: ((o_orderdate >= '1993-01-01'::date) AND (o_orderdate < '1994-01-01 00:00:00'::timestamp without time zone))
                                 ->  Bitmap Index Scan on i_o_orderdate  (cost=0.00..41391.94 rows=2270971 width=0)
                                       Index Cond: ((o_orderdate >= '1993-01-01'::date) AND (o_orderdate < '1994-01-01 00:00:00'::timestamp without time zone))
               ->  Hash  (cost=55647.15..55647.15 rows=1500615 width=8)
                     ->  Seq Scan on customer  (cost=0.00..55647.15 rows=1500615 width=8)
(27 rows)

so it really thinks that doing hashes with gigantic amounts of data is a good idea generally - this seems to be independent of work_mem - the plan looks the same with 1MB vs 126MB (which I had during the run). the profile of the backend eating the cpu looks similar to:

26351  27.9047  ExecScanHashBucket
 8239   8.7248  hash_seq_search
 6984   7.3958  hash_search_with_hash_value

setting hash_join to off results in a runtime of about 2.5 minutes (http://www.kaltenbrunner.cc/files/dbt3_without_hashjoin.txt):

 Sort  (cost=3700257.38..3700257.45 rows=25 width=37) (actual time=286820.962..286820.968 rows=5 loops=1)
   Sort Key: sum((lineitem.l_extendedprice * (1::double precision - lineitem.l_discount)))
   ->  HashAggregate  (cost=3700256.37..3700256.80 rows=25 width=37) (actual time=286820.932..286820.941 rows=5 loops=1)
         ->  Nested Loop  (cost=730956.43..3699893.04 rows=72666 width=37) (actual time=43551.767..286488.555 rows=72441 loops=1)
               Join Filter: (customer.c_nationkey = supplier.s_nationkey)
               ->  Merge Join  (cost=730956.43..3624153.73 rows=1816643 width=49) (actual time=43281.710..257082.739 rows=1822547 loops=1)
                     Merge Cond: (lineitem.l_orderkey = orders.o_orderkey)
                     ->  Index Scan using i_l_orderkey on lineitem  (cost=0.00..2715943.34 rows=60003360 width=16) (actual time=32.868..123668.380 rows=59991868 loops=1)
                     ->  Sort  (cost=730956.43..732091.92 rows=454194 width=41) (actual time=43248.797..45754.223
Re: [HACKERS] Fixed length data types issue
Tom Lane [EMAIL PROTECTED] writes:

  Gregory Stark [EMAIL PROTECTED] writes:

    In any case it seems a bit backwards to me. Wouldn't it be better to preserve bits in the case of short length words where they're precious rather than long ones? If we make 0xxx the 1-byte case it means ...

  Well, I don't find that real persuasive: you're saying that it's important to have a 1-byte not 2-byte header for datums between 64 and 127 bytes long. Which is by definition less than a 2% savings for those values.

Sure, but my thinking was that saving one byte on data between 64 and 127 bytes long is more important than saving two bytes on data between 4k and 8k or whatever the range was in that proposal.

  I think it's more important to pick bitpatterns that reduce the number of cases heap_deform_tuple has to think about while decoding the length of a field --- every "if" in that inner loop is expensive.

I'll have to spend a few hours tomorrow becoming one with that section of code. I looked at it already and was surprised at how short it was already so I can understand what you mean.

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] set_pglocale_pgservice considered harmful
Why in the world is set_pglocale_pgservice() located in src/port/path.c? I was just trying to find out why ecpg is failing on the Darwin buildfarm members since I modified the arrangements for pulling in extra libraries for AIX. The answer is that ecpglib needs path.c for last_dir_separator(), and path.c contains set_pglocale_pgservice(), which pulls in both exec.c and bindtextdomain/textdomain/gettext. IOW libintl gets pulled into ecpglib links for no reason whatsoever --- it certainly isn't being used by ecpglib itself. I think set_pglocale_pgservice should either be moved to exec.c or put into its own separate src/port/ file. Comments? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Buildfarm vs. Linux Distro classification
Peter Eisentraut wrote:

  Andrew Dunstan wrote:

    Lately there have been some buildfarm registrations for Debian testing/unstable or similarly described machines. I have kicked back against these, as the description seems to me to be far too open ended.

  Then again, it would be useful to actually test on Debian testing/unstable (or pre-release branches of other OS), because that would (a) expose problems with new toolchains and such earlier than in released versions, and (b) provide advance testing for when testing becomes the release. Consider, the number of users that will run 8.2 on Debian stable is probably going to be less than the number of users who will run 8.2 on what today is testing. I agree that the lack of a fixed version designation is unsatisfactory. I'm not sure whether that is actually necessary, though. If PostgreSQL doesn't work on some machine, then that's a problem anyway.

well I think Andrew is more scared of having multiple boxes on the buildfarm all claiming to be "Debian testing" or "Debian unstable" but without much information on how regularly those boxes are actually synced to those moving/changing branches, causing discussions on why it is suddenly failing on that box but not on the other. There might be quite a difference between a 2 month old testing and a recent one for example. However - we already have some precedent on the buildfarm for that (like the various -current BSD members).

Stefan

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] New job
While I'm sitting here in New Jersey in a room with Bruce Momjian (aka Rock Star), I figured now would be a good time to announce my new employment. I'll be doing sales support/engineering from Austin.

-- Jim Nasby [EMAIL PROTECTED] EnterpriseDB  http://enterprisedb.com

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Buildfarm vs. Linux Distro classification
Stefan Kaltenbrunner wrote:

  well I think Andrew is more scared of having multiple boxes on the buildfarm all claiming to be "Debian testing" or "Debian unstable" but without much information on how regularly those boxes are actually synced to those moving/changing branches, causing discussions on why it is suddenly failing on that box but not on the other.

It might make sense to attach the version information (in this case, gcc, glibc, kernel versions) to each particular test run, rather than to each particular farm member.

-- Alvaro Herrera  http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Buildfarm vs. Linux Distro classification
Alvaro Herrera wrote:

  Stefan Kaltenbrunner wrote:

    well I think Andrew is more scared of having multiple boxes on the buildfarm all claiming to be "Debian testing" or "Debian unstable" but without much information on how regularly those boxes are actually synced to those moving/changing branches, causing discussions on why it is suddenly failing on that box but not on the other.

  It might make sense to attach the version information (in this case, gcc, glibc, kernel versions) to each particular test run, rather than to each particular farm member.

that is a good idea generally but that information might not be available in a portable way on all platforms (and even then we might want to have version information on things/libs like readline, zlib, ldap, too) ...

Stefan

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Fixed length data types issue
Simon Riggs [EMAIL PROTECTED] writes: Is this an 8.2 thing? You are joking, no? If not, is Numeric508 applied? No, that got rejected as being too much of a restriction of the dynamic range, eg John's comment here: http://archives.postgresql.org/pgsql-general/2005-12/msg00246.php I think a more practical way of shaving 2 bytes from NUMERIC would be to invent a fixed-2-byte-header variant of varlena. That's something we talked about in the NUMERIC thread but ultimately dropped --- it'd be worth reconsidering along with the current ideas about multiple varlena header formats, however. The reason to not just make it use the generalized varlena format under discussion is that it'd really like to have at least 2-byte alignment; that would be enough to avoid memcpy-for-alignment. Another interesting subset of the problem is the inet/cidr datatypes, which I think would be perfectly happy with the 1-byte-header variants we've talked about --- AFAIR the inet code doesn't really have any alignment requirements on its data, and it certainly doesn't need values longer than 63 bytes. So that subset of the concept might need to be broken out as a separately usable thing too. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Buildfarm vs. Linux Distro classification
Peter Eisentraut [EMAIL PROTECTED] writes: I agree that the lack of a fixed version designation is unsatisfactory. I'm not sure whether that is actually necessary, though. If PostgreSQL doesn't work on some machine, then that's a problem anyway. The buildfarm script already seems to record various info such as uname output on-the-fly. If we could get it to record compiler version (gcc -v is easy, but equivalent incantations for vendor compilers might be harder to find) and a few other facts on-the-fly, I think the instability of the platforms might not be that big a deal. In practice, it seems that only Linux-based distros have bought into this idea that bleeding-edge tools are a good thing, so solutions that work only on Linux may be sufficient. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Fixed length data types issue
On Mon, Sep 11, 2006 at 01:15:43PM -0400, Tom Lane wrote: Gregory Stark [EMAIL PROTECTED] writes: In any case it seems a bit backwards to me. Wouldn't it be better to preserve bits in the case of short length words where they're precious rather than long ones? If we make 0xxx the 1-byte case it means ... Well, I don't find that real persuasive: you're saying that it's important to have a 1-byte not 2-byte header for datums between 64 and 127 bytes long. Which is by definition less than a 2% savings for those values. I think its's more important to pick bitpatterns that reduce the number of cases heap_deform_tuple has to think about while decoding the length of a field --- every if in that inner loop is expensive. I like your thought process on this, Tom. I read your suggestions and didn't respond because I was in full agreement with them. The 1-byte header would be valuable even if it only worked for 32-bytes. It is important to keep CPU overhead down by making it easy to switch off the bit patterns. Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] __ . . _ ._ . . .__. . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/|_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:

  I'm about to do some benchmarking on -HEAD on some hardware I have available and it seems I'm hitting a rather weird issue causing the osdl dbt3 benchmark to run very slow and eating CPU time for hours ...

Could we see the actual EXPLAIN ANALYZE results for the slow plan? I'm unconvinced by your "hash join is bad" analysis, especially in the cases where you're giving it lots of work_mem. I think it's got something to do with the different join orders. The rowcount estimates in the fast plan all seem pretty good, but I'm betting something is wrong with some of them in the slow case.

regards, tom lane

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] contrib/xml2 and PG_MODULE_MAGIC
Tom Lane wrote: Even more interesting would be to fix things so that xml2 gets built as part of the regular contrib build, but I'm not sure if we're ready to add stuff to the configure script for the sole benefit of a contrib module. It might be good to get the configury code out in this release for advance testing. I'd be more willing to do it if there were a long-term plan for putting libxml-dependent code into the core. Do we have a roadmap yet for XML development? I thought there was a session at the anniversary conference devoted to sketching one, but if anything came out of that I missed it. I put a write-up here: http://developer.postgresql.org/index.php/XML_Support -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: I'm about to do some benchmarking on -HEAD one some hardware I have available and it seems I'm hitting a rather weird issue causing the osdl dbt3 benchmark to run very slow and eating CPU time for hours ... Could we see the actual EXPLAIN ANALYZE results for the slow plan? I'm unconvinced by your hash join is bad analysis, especially in the cases where you're giving it lots of work_mem. I think it's got something to do with the different join orders. The rowcount estimates in the fast plan all seem pretty good, but I'm betting something is wrong with some of them in the slow case. will do - but that will take a while - the slow one runs for 12h or so even without explain analyze overhead ... Stefan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] XML support wiki page
For those who don't read all the threads, I'll repeat it here. I've put up a wiki page working out the mysterious XML support: http://developer.postgresql.org/index.php/XML_Support This is pretty much my talk from the conference. The short status is that we have quite a bit of code ready and willing for 8.3. Some factions are working on sneaking some of that into 8.2, but not me. :) (It's a wiki, so please add stuff if you like. But keep the discussion here.) -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:

  btw - the "hashjoin is bad" comment was more or less based on the observation that nearly all of the cpu is burned in hash-related functions in the profile (when profiling over a longer period of time those accumulate even more % of the time than in the short profile I included in the original report)

[ shrug... ] Two out of the three functions you mentioned are not used by hash join, and anyway the other plan probably has a comparable execution density in sort-related functions; does that make it bad? It's possible that the large time for ExecScanHashBucket has something to do with skewed usage of the hash buckets due to an unfortunate data distribution, but that's theorizing far in advance of the data.

regards, tom lane

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Buildfarm vs. Linux Distro classification
Tom Lane wrote:
> The buildfarm script already seems to record various info such as uname output on-the-fly. If we could get it to record compiler version (gcc -v is easy, but equivalent incantations for vendor compilers might be harder to find) and a few other facts on-the-fly, I think the instability of the platforms might not be that big a deal.

The configure script actually tries to get the compiler to put out a version number, and you can see the result of that in config.log. For other components, however, it seems impossible to get the version numbers across platforms.

-- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [PATCHES] [HACKERS] Fix linking of OpenLDAP libraries
Tom Lane wrote:
>> Here is a new patch that replaces the previous one; it adds two macros LDAP_LIBS_FE and LDAP_LIBS_BE for frontend and backend, respectively. I did not only add them to the Makefile for interfaces/libpq, but also everywhere something is linked against libpq, in case somebody links statically.
>
> Applied, but without that last part. It builds OK for me on Darwin, which is moderately picky about that sort of thing, but someone should try AIX.

It builds fine on AIX 5.3 as long as you tell it to link with libpq.so. Static builds against libpq.a will fail. Should -lldap or -lldap_r be added to "pg_config --libs"? If yes, which of them?

Yours, Laurenz Albe
Re: [PATCHES] [HACKERS] Fix linking of OpenLDAP libraries
On Mon, Sep 11, 2006 at 12:13:29PM +0200, Albe Laurenz wrote:
>> Applied, but without that last part. It builds OK for me on Darwin, which is moderately picky about that sort of thing, but someone should try AIX.
>
> It builds fine on AIX 5.3 as long as you tell it to link with libpq.so. Static builds against libpq.a will fail. Should -lldap or -lldap_r be added to "pg_config --libs"? If yes, which of them?

Static links are going to require it on every platform, not just AIX. The question we want to ask is how easy we want to make static linking, because the same treatment will have to apply to -lssl, -lcrypto, -lkrb5, -lk5crypto and quite possibly others. Do we really want to go there?

Have a nice day,
-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
"From each according to his ability. To each according to his ability to litigate."
Re: [PATCHES] [HACKERS] Fix linking of OpenLDAP libraries
>> Here is a new patch that replaces the previous one; it adds two macros LDAP_LIBS_FE and LDAP_LIBS_BE for frontend and backend, respectively. I did not only add them to the Makefile for interfaces/libpq, but also everywhere something is linked against libpq, in case somebody links statically.
>
> Applied, but without that last part. It builds OK for me on Darwin, which is moderately picky about that sort of thing, but someone should try AIX.

I'll do that today.

Laurenz Albe
Re: [HACKERS] Fixed length data types issue
Tom Lane <[EMAIL PROTECTED]> writes:
> No, that got rejected as being too much of a restriction of the dynamic range, eg John's comment here: http://archives.postgresql.org/pgsql-general/2005-12/msg00246.php

That logic seems questionable. John makes two points:

a) Crypto applications are within a factor of two of the proposed limitation. Firstly, nobody does actual crypto work using Postgres's numeric data type; it would be ridiculously slow. They wouldn't even store numbers used for crypto in it -- they would use bytea or something like that to store a binary bit string. Secondly, there's nothing blocking us from changing it again in the future. It would make pg_upgrade a pain, but handling redefined user-defined datatypes would be a necessity anyway. A future version could always revert the change.

b) Because we're usually not especially concerned with CPU usage of numeric, we're also not concerned with space usage of numeric. I'm not sure what the arguments were that he's referring to, but I have trouble imagining a credible argument against being concerned with CPU usage that wouldn't lead to the conclusion that space usage was *more* important.

I was actually going to suggest going back and looking for *more* space savings in numeric. I had assumed this first step had gone in long ago.

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Fixed length data types issue
Gregory Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> No, that got rejected as being too much of a restriction of the dynamic range, eg John's comment here: http://archives.postgresql.org/pgsql-general/2005-12/msg00246.php
>
> That logic seems questionable. John makes two points: a) crypto applications are within a factor of two of the proposed limitation. Firstly, nobody does actual crypto work using Postgres's numeric data type. It would be ridiculously slow.

That's utterly irrelevant. The point is that there are standard applications today in which people need that much precision; therefore, the argument that 10^508 is far more than anyone could want is on exceedingly shaky ground. Besides, isn't "it's too slow" a bug we'd like to fix someday?

regards, tom lane
Re: [HACKERS] New job
Hello,

On Mon, 2006-09-11 at 13:11 -0500, Jim C. Nasby wrote:
> While I'm sitting here in New Jersey in a room with Bruce Momjian (aka Rock Star), I figured now would be a good time to announce my new employment. I'll be doing sales support/engineering from Austin.

Congrats Jim.

Cheers,
-- The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
Re: [HACKERS] New job
Congrats Jim! - Luke

Msg is shrt cuz m on ma treo

-----Original Message-----
From: Jim C. Nasby [mailto:[EMAIL PROTECTED]
Sent: Monday, September 11, 2006 02:12 PM Eastern Standard Time
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] New job

While I'm sitting here in New Jersey in a room with Bruce Momjian (aka Rock Star), I figured now would be a good time to announce my new employment. I'll be doing sales support/engineering from Austin.
-- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com
Re: [HACKERS] Fixed length data types issue
Tom Lane <[EMAIL PROTECTED]> writes:
> That's utterly irrelevant. The point is that there are standard applications today in which people need that much precision; therefore, the argument that 10^508 is far more than anyone could want is on exceedingly shaky ground.

My point is that those applications aren't practical in our current implementation, and we can always extend the precision later if we decide we want to.

> Besides, isn't "it's too slow" a bug we'd like to fix someday?

The only way I see to do that is to replace our implementation entirely with something like libgmp. At first I meant that as a reductio ad absurdum argument, but, uh, come to think of it, why *do* we have our own arbitrary-precision library? Is there any particular reason we can't use one of the existing binary implementations? I think libgmp itself is GPL'd, but there are others, and even if libgmp is GPL'd that just puts it into the same camp as readline. It would have to be an option, and even under the strictest interpretations of the GPL, as long as there are alternative implementations it's fine. I was going to spend time looking at optimising numeric's storage, but it seems like a waste of time if we could just use an implementation that's better.

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [HACKERS] Fix linking of OpenLDAP libraries
Tom Lane wrote:
> Albe Laurenz <[EMAIL PROTECTED]> writes:
>> Let me expand a little on some of the peculiarities of shared libraries on AIX: - A normal AIX shared library is called libXX.a. It is an 'ar' archive that contains the shared object(s).
>
> Ah, so the problem really boils down to funny naming conventions. If they use .a for both shared and static libraries, how does anyone tell the difference?

It sounds to me like there is no difference. Notice how his example ldd output shows dependencies on specific .o entries within the various .a files that reside on the system, rather than on the .a files as a whole. If those entries had been statically linked, then they wouldn't have shown up in the ldd output at all. So the difference is no longer between static libraries and shared libraries -- they're now just libraries. The only difference is how you link to them. What IBM has done here is very sensible, and is really what the other Unixes should have done from the beginning: a library is just a library, and what differs is how it's used.

-- Kevin Brown [EMAIL PROTECTED]
Re: [HACKERS] Fixed length data types issue
Gregory Stark <[EMAIL PROTECTED]> writes:
> At first I meant that as a reductio ad absurdum argument, but, uh, come to think of it, why *do* we have our own arbitrary precision library? Is there any particular reason we can't use one of the existing binary implementations?

Going over to binary storage would trade off I/O speed for calculation speed, which is probably not a win for everyone; and even more seriously, how are you going to represent decimal fractions exactly? The fact that 0.01 is 0.01 and not just a near approximation thereto is critical for a lot of our users. I have no objection to relying on someone else's package if it actually solves our problem, but not if it just solves a related problem. (It might be interesting to offer a bignum datatype that uses binary math internally, but replacing numeric with it would be a hard sell.)

regards, tom lane
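[Archive note: Tom's exact-decimal point is easy to demonstrate. This is a small hypothetical sketch, using Python purely for illustration, of the difference between a binary floating-point "0.01" and a true decimal 0.01.]

```python
from decimal import Decimal

# 0.01 has no finite binary expansion, so a binary float can only
# approximate it; decimal arithmetic represents it exactly.
assert 0.1 + 0.2 != 0.3                                  # binary rounding artifact
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")  # exact in decimal

# Decimal(float) exposes the value the hardware double actually stores,
# which is slightly off from one one-hundredth:
print(Decimal(0.01))
```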
Re: [HACKERS] Fixed length data types issue
Tom Lane <[EMAIL PROTECTED]> writes:
> Gregory Stark <[EMAIL PROTECTED]> writes:
>> At first I meant that as a reductio ad absurdum argument, but, uh, come to think of it, why *do* we have our own arbitrary precision library? Is there any particular reason we can't use one of the existing binary implementations?
>
> Going over to binary storage would trade off I/O speed for calculation speed, which is probably not a win for everyone;

Huh? Which would you expect binary to be worse at than decimal? I would expect it to be both faster and denser.

> and even more seriously, how are you going to represent decimal fractions exactly? The fact that 0.01 is 0.01 and not just a near approximation thereto is critical for a lot of our users.

Certainly any arbitrary precision library isn't worth beans if it can't represent values accurately. I'm not sure how gmp and the others represent their data, but my first guess is that there's no particular reason the base of the mantissa and exponent have to be the same as the base the exponent is interpreted in. That is, you can store a base-10 exponent but store it and the mantissa in two's complement integers.

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Fixed length data types issue
On Mon, Sep 11, 2006 at 07:05:12PM -0400, Gregory Stark wrote:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> Going over to binary storage would trade off I/O speed for calculation speed, which is probably not a win for everyone;
>
> Huh? Which would you expect binary to be worse at than decimal? I would expect it to be both faster and denser.

Representation is the difficult part.

>> and even more seriously, how are you going to represent decimal fractions exactly? The fact that 0.01 is 0.01 and not just a near approximation thereto is critical for a lot of our users.
>
> Certainly any arbitrary precision library isn't worth beans if it can't represent values accurately.

This isn't correct. Try representing 0.01 accurately in binary. See what you come up with. :-)

> I'm not sure how gmp and the others represent their data, but my first guess is that there's no particular reason the base of the mantissa and exponent have to be the same as the base the exponent is interpreted in. That is, you can store a base-10 exponent but store it and the mantissa in two's complement integers.

I don't think gmp does this, nor do I expect it would be trivial to author a package that was both efficient and able to operate in any base. I believe gmp operates in a base that is the size of the CPU word, usually 32 bits or 64 bits. It does not offer the ability to calculate or store using base 10. I've seen libraries that do an acceptable job storing items in base 1000 or higher for use in decimal calculations. I have no idea what PostgreSQL itself does... :-)

Cheers, mark
-- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] | Neighbourhood Coder | Ottawa, Ontario, Canada
"One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them..." http://mark.mielke.cc/
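[Archive note: for what it's worth, PostgreSQL's numeric type stores decimal digit groups in base 10000 (NBASE in numeric.c, one int16 per group), which is what keeps 0.01 exact while still packing four decimal digits per unit. The following is a rough hypothetical sketch of that digit-group representation, not the actual backend code.]

```python
# Sketch of base-10000 digit-group storage, roughly in the spirit of
# PostgreSQL's numeric type (assumption: NBASE = 10000, one small
# integer per group of four decimal digits).
NBASE = 10000

def to_digits(n):
    """Split a non-negative integer into base-10000 groups, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n:
        n, d = divmod(n, NBASE)
        digits.append(d)
    return digits[::-1]

def from_digits(digits):
    """Reassemble an integer from base-10000 groups."""
    n = 0
    for d in digits:
        n = n * NBASE + d
    return n

# 123456789 = 1 * 10000^2 + 2345 * 10000 + 6789
print(to_digits(123456789))
```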
[HACKERS] pgbench is badly broken since July
I have just realized that the recent patches in pgbench have altered its behavior in a way that destroys reproducibility of results --- I'm seeing reported TPS numbers about twice what they were before that. I'd love to say we did something in the past month that made the backend 2X faster, but sadly, 'tain't so :-(

I've only done limited investigation, but what seems to be happening is that the -s scale factor is no longer being applied to adjust the range of the random variables, leading to only part of the database being touched. For example, with -s = 10, the older code updates all 10 branches rows, but CVS HEAD only seems to touch bid = 1. I see that the code still thinks it is scaling the numbers, but the mechanism for doing that has changed drastically, and I suspect that's the problem. I assume there's some garden-variety bug in there.

At a more fundamental level, the recent patches seem to mean that pgbench is doing noticeably more work to issue its queries than it was before, and I wonder whether this isn't going to destroy cross-version reproducibility in its own way. Have you done any timing studies to verify that the new and old code have comparable performance? I don't mind enhancing pgbench's functionality for non-default tests, but I think it's important that the default case remain comparable over time. If there is a noticeable speed difference, then I'd vote for going back to the klugy old code for adjusting the default script's values.

And at the nitpicking level, is :tps a good name for the variable that reflects the scaling factor? It seems awfully easy to confuse it with the TPS numbers that pgbench reports. Perhaps :scale or some such would be better.

regards, tom lane
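[Archive note: a hypothetical sketch of the scaling behavior under discussion. Names and multipliers follow pgbench's documented defaults (scale factor s means s branches, 10*s tellers, 100000*s accounts); the reported bug is that the key ranges stopped being multiplied by s, pinning bid at 1.]

```python
import random

# Per-unit-of-scale table sizes in pgbench's default TPC-B-like schema.
NBRANCHES, NTELLERS, NACCOUNTS = 1, 10, 100_000

def random_keys(scale, rng=random):
    """Keys the default script should draw: uniform over the whole scaled range."""
    bid = rng.randint(1, NBRANCHES * scale)
    tid = rng.randint(1, NTELLERS * scale)
    aid = rng.randint(1, NACCOUNTS * scale)
    return bid, tid, aid

# With -s 10, updates must eventually hit all 10 branches rows; the bug
# described above effectively left the range unscaled, touching only bid = 1.
bids = {random_keys(10)[0] for _ in range(2000)}
print(sorted(bids))
```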
Re: [HACKERS] pgbench is badly broken since July
I'll look into this.
-- Tatsuo Ishii SRA OSS, Inc. Japan

> I have just realized that the recent patches in pgbench have altered its behavior in a way that destroys reproducibility of results --- I'm seeing reported TPS numbers about twice what they were before that. [...]
Re: [HACKERS] Buildfarm vs. Linux Distro classification
Quoth [EMAIL PROTECTED] (Peter Eisentraut):
> Andrew Dunstan wrote:
>> Lately there have been some buildfarm registrations for "Debian testing/unstable" or similarly described machines. I have kicked back against these, as the description seems to me to be far too open ended.
>
> Then again, it would be useful to actually test on Debian testing/unstable (or pre-release branches of other OSes), because that would (a) expose problems with new toolchains and such earlier than in released versions, and (b) provide advance testing for when testing becomes the release. Consider, the number of users that will run 8.2 on Debian stable is probably going to be less than the number of users who will run 8.2 on what today is testing.

I suppose I'm partly to blame for bringing this one up; I proposed doing a build on a box at home and at work (one IA-32, one AMD-64), both running something of a mix of Debian unstable/testing. I agree that the lack of a fixed version designation is unsatisfactory.

> I'm not sure whether that is actually necessary, though. If PostgreSQL doesn't work on some machine, then that's a problem anyway.

Yeah. There are a somewhat limited number of sorts of problems that could cause this:
- I could be running an icky, out-of-date set of libraries or such, in which case there's not much value in trying to fix things on the PostgreSQL side.
- A change perhaps forthcoming in some future Debian release has busted something. Knowing that sooner is better than knowing it later...
- Such a problem may be ephemeral; it may be something I need to report to someone _else_ upstream. For instance, a bleeding-edge readline may have just drawn blood, and I need to let them know...

It seems to me that there is some value in putting together a script that tries to identify some of the interesting bits of the toolchain. Here's a Mark 0 version:

  #!/bin/sh
  FINDGCCVERSION=`$CC --version`
  KERNELVERSION=`uname -a`
  ARCH=`arch`
  # Uncomment one of the following...
  # Debian:
  #LIBC=`dpkg -l libc6`
  # Red Hat/SuSE/Fedora:
  #LIBC=`rpm -q libc6`
  # Slackware, etc:
  #LIBC=`ls -l /lib/libc.so.*`

I'll bet one could get more information pretty cheaply, including release names and such, by recognizing the existence of distribution-specific files in /etc, such as /etc/debian_version, /etc/SuSE-release, /etc/fedora-release, /etc/redhat-release. It's not necessarily vital to recognize every combination of anything that's plausible; if someone has an oddball system, reporting the details is their problem...

-- output = (cbbrowne @ linuxfinances.info) http://linuxdatabases.info/info/x.html
"Natives who beat drums to drive off evil spirits are objects of scorn to smart Americans who blow horns to break up traffic jams." -- Unknown
Re: [HACKERS] Buildfarm vs. Linux Distro classification
Christopher Browne wrote:
> It seems to me that there is some value in putting together a script that tries to identify some of the interesting bits of the toolchain.

Yeah; but why not just a bunch of commands, some of which are expected to work on any particular machine, and save the whole output as a single string? It's not very clean, but it should get the important details. To support a new machine, just add more commands to the script. A simple version of this, based on your Mark 0, could be:

  uname -a
  $CC --version
  $CC -V
  $CC -v
  ls -l /lib/libc.so*

No need to comment/uncomment anything.

-- Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] [PATCHES] Resurrecting per-page cleaner for btree
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> I've applied this but I'm now having some second thoughts about it, because I'm seeing an actual *decrease* in pgbench numbers from the immediately prior CVS HEAD code.
>
> The attached patch requires the new row to fit, and 10% to be free on the page. Would someone test that?

At the moment, I cannot replicate any consistent difference between CVS HEAD with the patch, without the patch, with the patch plus your BLCKSZ/10 limit addition, or with a variant BLCKSZ/32 limit addition. That's whether I use HEAD's broken version of pgbench or one from late July. So I'm feeling a tad frustrated ... but I have no evidence in favor of changing what is in CVS, and accordingly recommend that we leave well enough alone for 8.2.

regards, tom lane