[HACKERS] remapped localhost causes connections to localhost to fail using Postgres
On a Windows machine, the file C:\WINDOWS\system32\drivers\etc\hosts was edited to remap localhost to the machine name by adding the following line: 127.0.0.1

After this change, Postgres would no longer allow access using the address localhost; only the machine name could be used to reach the database. Is this by design? In other words, psql -h localhost fails, but psql -h <machine name> succeeds.
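For reference, a Windows hosts file that keeps localhost resolvable while also mapping the machine name to the loopback address might look like the sketch below (MYMACHINE is a placeholder, not the reporter's actual machine name). Whether the reported failure is really caused by losing the localhost mapping, or by pg_hba.conf matching, would need separate checking:

```
# C:\WINDOWS\system32\drivers\etc\hosts
# Keep the canonical localhost entry, then add the machine name
# on its own line rather than replacing localhost.
127.0.0.1    localhost
127.0.0.1    MYMACHINE
```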
[HACKERS] Re: [GENERAL] postgresql-9.3.1-1-windows-x64.exe does not install correctly for me
The PostgreSQL installer for Windows 64 appears to be broken for Microsoft Windows Server 2012 Standard. Even after uninstalling, removing the entire postgresql directory structure, and running the installer as administrator, I get this error:

fixing permissions on existing directory C:/Program Files/PostgreSQL/9.3/data ... initdb: could not change permissions of directory C:/Program Files/PostgreSQL/9.3/data: Permission denied

I was able to get it to install by giving permissions on the postgresql folder to applications and services, but I am not really sure what allowed it to complete. However, that sort of installation would be unsafe for regular users. I only use it for testing, so it does not really matter much in my case (though I would like it much better if I did not have to munge around to make the install work).

-----Original Message-----
From: Adrian Klaver [mailto:adrian.kla...@gmail.com]
Sent: Thursday, October 31, 2013 5:03 PM
To: Dann Corbit; 'pgsql-gene...@postgresql.org'
Subject: Re: [GENERAL] postgresql-9.3.1-1-windows-x64.exe does not install correctly for me

On 10/31/2013 11:53 AM, Dann Corbit wrote: postgresql-9.3.1-1-windows-x64.exe Problem running post-install step. Installation may not complete correctly. The database cluster initialization failed. Is there an installation log I can examine to determine the problem more completely?

http://www.enterprisedb.com/resources-community/pginst-guide#troubleshooting

--
Adrian Klaver
adrian.kla...@gmail.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New regression test time
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Josh Berkus
Sent: Saturday, June 29, 2013 3:00 PM
To: Andrew Dunstan
Cc: Alvaro Herrera; pgsql-hackers@postgresql.org; Robins Tharakan
Subject: Re: [HACKERS] New regression test time

On 06/29/2013 02:14 PM, Andrew Dunstan wrote: AIUI: They do test feature use and errors that have cropped up in the past that we need to beware of. They don't test every bug we've ever had, nor do they exercise every piece of code.

If we don't have a test for it, then we can break it in the future and not know we've broken it until .0 is released. Is that really a direction we're happy going in?

Maybe there is a good case for these last two in a different set of tests.

If we had a different set of tests, that would be a valid argument. But we don't, so it's not. And nobody has offered to write a feature to split our tests either.

I have to say, I'm really surprised at the level of resistance people on this list are showing to the idea of increasing test coverage. I thought that Postgres was all about reliability? For a project as mature as we are, our test coverage is abysmal, and I think I'm starting to see why.

An ounce of prevention is worth a pound of cure. The cost of a bug rises exponentially, starting at requirements gathering and ending at the customer. Where I work, we have two computer rooms full of machines that run tests around the clock, 24x365. Even so, a full regression takes well over a week because we perform hundreds of thousands of tests.

All choices of this kind are trade-offs. But in such situations, my motto is: do whatever will make the customer prosper the most. IMO-YMMV
Re: [HACKERS] ancient sequence point bug
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
Sent: Tuesday, April 16, 2013 7:52 PM
To: Peter Eisentraut
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] ancient sequence point bug

Peter Eisentraut pete...@gmx.net writes: This code in bootstrap.c contains a sequence point violation (or whatever that is really called):

while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
{
    (*app)->am_oid = HeapTupleGetOid(tup);
    memmove((char *) (*app++)->am_typ,
            (char *) GETSTRUCT(tup),
            sizeof((*app)->am_typ));
}

What exactly is the violation? sizeof() is a side-effect-free compile-time constant, and the first value to be passed to memmove seems perfectly well defined. Unless it is C99 and the object is a VLA, which must be evaluated at runtime. I guess that (*app)->am_typ is not a VLA, especially since PostgreSQL does not require C99 to compile. I grant that this is not terribly good coding style, but I don't believe there's an actual risk here.

I agree that there is no risk in the sizeof operator. According to my understanding, even this is legal:

unsigned long long *p = 0;
size_t stupid = sizeof (*p);
printf("size of a long long is %u\n", (unsigned) stupid);

But I also guess that most compilers will have a cow when scanning it.

In commit 1aebc361, another place in the same file was fixed like this:

Well, I don't really remember why I did that twelve years ago, but seeing that there are a number of purely cosmetic changes in that commit, I'm thinking it was only meant to be cosmetic. I certainly have no objection to making this code look more like that code, I'm just not seeing that it's a bug.

You are right. It's not a bug. However, I would not be surprised if GCC or Clang would shriek at the higher warning levels, usually not being able to apply artificial intelligence to solve problems that seem simple to humans.
In the interest of quiet compiles, equivalent code that does not produce a warning seems like a good idea. IMO-YMMV.

regards, tom lane
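The point about sizeof being side-effect-free is easy to demonstrate. The sketch below (the function name is mine, invented for illustration) is essentially the quoted snippet wrapped in a function: it compiles and runs cleanly because the null pointer is never actually dereferenced.

```c
#include <stdio.h>
#include <stddef.h>

/* sizeof does not evaluate its operand (unless the operand is a C99
 * variable-length array), so sizeof(*p) on a null pointer is perfectly
 * defined: no dereference ever happens at runtime. */
size_t size_of_pointee(void)
{
    unsigned long long *p = 0;
    size_t n = sizeof(*p);      /* compile-time constant, p unread */

    printf("size of an unsigned long long is %u\n", (unsigned) n);
    return n;
}
```

This is exactly why the bootstrap.c code is safe in practice even though the style invites warnings.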
[HACKERS] Re: Why do we still perform a check for pre-sorted input within qsort variants?
-----Original Message-----
From: gsst...@gmail.com [mailto:gsst...@gmail.com] On Behalf Of Greg Stark
Sent: Friday, March 08, 2013 4:59 PM
To: Dann Corbit
Cc: Bruce Momjian; Peter Geoghegan; Robert Haas; Tom Lane; PG Hackers
Subject: Re: Why do we still perform a check for pre-sorted input within qsort variants?

On Fri, Mar 8, 2013 at 7:43 PM, Dann Corbit dcor...@connx.com wrote: Checking for pre-sorted input will not make the routine faster on average. However, it prevents a common perverse case where the runtime goes quadratic, so sorting 10^6 elements will take K*10^12 operations when the bad condition does occur. Checking for pre-sorted data can be thought of as an insurance policy. It's kind of a pain to pay the premiums, but you sure are glad you have it when an accident occurs. Because the check is linear, and the algorithm is O(n*log(n)), the cost is not terribly significant.

Well, pre-sorted inputs are not the only quadratic case. If we really wanted to eliminate quadratic cases we could implement the pivot choice algorithm that guarantees n*log(n) worst case.

There are three things that give qsort fits:
1. Large ascending partitions.
2. Large descending partitions.
3. Bad choice of the median.

If you examine the link in the previous article I gave: http://www.cs.toronto.edu/~zhouqq/postgresql/sort/sort.html it has a subfolder located by following this link: http://www.cs.toronto.edu/~zhouqq/postgresql/sort/dann/

In that folder is a file called qsortpdq.c. In file qsortpdq.c is a function:

void qsortPDQ(void *a, size_t n, size_t es, int (*cmp) (const void *, const void *));

which is an interface that checks for ascending partitions, descending partitions, and does a median-of-3 choice for the pivot. Now, even this routine will have very rare inputs that can cause it to go quadratic. The probability of one of these particular inputs is very low.
If you want a 100% guarantee against quadratic behavior, then qsortpdq can be wrapped in heapsort with a recursion depth check. That sort would never go quadratic. That method is called introspective sort. Here are some links that describe introspective sort:

http://en.wikipedia.org/wiki/Introsort
https://secweb.cs.odu.edu/~zeil/cs361/web/website/Lectures/heapsort/pages/introspect.html

And this is the real thing, right from the horse's mouth: http://www.cs.rpi.edu/~musser/gp/introsort.ps

There is no such thing as a quicksort that never goes quadratic. It was formally proven. I cannot find the proof that I once read, but the article "A Killer Adversary for Quicksort" by M. D. McIlroy answers pretty nearly. Here is the summary from that paper:

SUMMARY: Quicksort can be made to go quadratic by constructing input on the fly in response to the sequence of items compared. The technique is illustrated by a specific adversary for the standard C qsort function. The general method works against any implementation of quicksort, even a randomizing one, that satisfies certain very mild and realistic assumptions.

But that will increase the average run-time. If we're not doing that then I think your whole argument falls apart. We do care about the average case as well as the worst case.

The makefile in folder http://www.cs.toronto.edu/~zhouqq/postgresql/sort/dann/ will build the sort system and test it. The makefile in folder http://www.cs.toronto.edu/~zhouqq/postgresql/sort/sort.html will build the simpler sort system and test it.

I would suggest, in addition, the use of this routine to create a tough data set for robustness: http://www.cs.dartmouth.edu/~doug/aqsort.c

There's been a *ton* of research on sorting. I find it hard to believe there isn't a pretty solid consensus on which of these defense mechanisms is the best trade-off.

I'm partial to chapter 13 of C Unleashed in that regard. As for the trade-offs, the problem is that they really are trade-offs.
However, the cost is quite small (benchmark and see) and the risk of not performing the checks is enormous. Mostly sorted data happens all the time in real life, as do other perverse distributions that cause problems for sorting algorithms.

My own personal recommendation would be to include every single safeguard (all three data checks and a recursion depth check as well). That's what we do here (I work at a database tools company).
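The insurance policy being argued over is easy to sketch. The wrapper below (names are mine, not taken from qsortpdq.c) pays the linear premium up front and only falls through to the O(n*log n) sort when the data is not already ordered:

```c
#include <stdlib.h>

/* return 1 when the array is already in non-descending order */
int already_sorted(const void *a, size_t n, size_t es,
                   int (*cmp)(const void *, const void *))
{
    const char *base = a;
    for (size_t i = 1; i < n; i++)
        if (cmp(base + (i - 1) * es, base + i * es) > 0)
            return 0;
    return 1;
}

/* O(n) scan first; run the real sort only if the scan fails */
void qsort_with_check(void *a, size_t n, size_t es,
                      int (*cmp)(const void *, const void *))
{
    if (!already_sorted(a, n, es, cmp))
        qsort(a, n, es, cmp);
}

/* example comparator for int keys */
int int_cmp(const void *x, const void *y)
{
    int a = *(const int *) x, b = *(const int *) y;
    return (a > b) - (a < b);
}
```

On already-sorted input the wrapper does n-1 comparisons and no swaps, which is the whole point: it converts a potentially quadratic case into a linear one for the cost of one extra pass on unsorted input.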
[HACKERS] Re: Why do we still perform a check for pre-sorted input within qsort variants?
-----Original Message-----
From: gsst...@gmail.com [mailto:gsst...@gmail.com] On Behalf Of Greg Stark
Sent: Saturday, March 09, 2013 11:39 AM
To: Dann Corbit
Cc: Bruce Momjian; Peter Geoghegan; Robert Haas; Tom Lane; PG Hackers
Subject: Re: Why do we still perform a check for pre-sorted input within qsort variants?

On Sat, Mar 9, 2013 at 10:32 AM, Dann Corbit dcor...@connx.com wrote: There is no such thing as a quicksort that never goes quadratic. It was formally proven.

The median of medians selection of the pivot gives you O(n*log(n)).

No. It does make O(n*n) far less probable, but it does not eliminate it. If it were possible, then introspective sort would be totally without purpose. And yet introspective sort is the algorithm of choice for the ANSI/ISO C++ standard directly because it will never go quadratic.

Bentley and McIlroy's "Engineering a Sort Function" says this about the final qsort source code model chosen for their implementation, in a footnote: "Of course, quadratic behavior is still possible. One can generate fiendish inputs by bugging Quicksort: Consider key values to be unknown initially. In the code for selecting a partition element, assign values in increasing order as unknown keys are encountered. In the partitioning code, make unknown keys compare high."

Interestingly, some standard library qsort routines will go quadratic if simply given a set that contains random 0 and 1 values for the input set.

It has been formally proven that no matter what method is used to choose the median (other than calculating the exact median of every partition, which would make quicksort slower than heapsort) there exists an adversary distribution that makes quicksort go quadratic. (Unfortunately, I can't find the proof or I would give you the citation.)

Median selection that is randomized does not select the true median unless it operates over all the elements of the entire partition. With some small probability, it makes the worst possible choice for the median.
It also does not matter if you choose the first, last and middle elements for a median of three (or other evenly selected points for larger sample sets) or if you select the sample with randomization. Both methods have adversaries. Now, the quickselect algorithm can select the median in O(n), but that is again an average. There are also adversary distributions for quickselect. On the other hand, various safeguards can make O(n*n) extremely improbable {even probabilities approaching zero}.

There is a tradeoff with more and more complicated safeguards. If I choose large subsets of a partition and calculate the true median of the subset, it will make the routine safer from quadratic behavior but it will also make it slower. At some point, you end up with a routine no faster than heapsort, while also being much more complex and therefore more difficult to maintain. Hence the necessity of introspective sort.

On the other hand, there are also data-specific sorts that have very special properties. There are cache-conscious versions of burstsort that can sort strings much faster than quicksort or even radix sort, according to measurements. There is a special version of LSD radix sort that will sort an array in one counting pass and N distribution passes, where N is the number of radix-sized digits in the key. So, for instance, if you are sorting 4-byte integers and your radix is 256 {one byte} then you can sort in 5 passes: one pass to count the bytes in each integer by position, and 4 passes to distribute the data. This can also be done in place.

MSD radix sort can also be used as an outer sort container which sorts by distribution until the partitions are small enough and then switches to introspective sort (which eventually switches to insertion sort). Radix sorts can also be made totally generic via callback routines, like quicksort. Instead of a comparison operator, the callback gets a request for the radix bits for a specific bin number and then returns the radix count bits for that specific bin.
For unsigned char and radix 256, this is trivially character [i] from the string for bin [i], for example.

I guess improving sort routines all boils down to how much energy you want to put into it. Donald Knuth once stated that, according to measurement, about 40% of business-related mainframe computer cycles are involved in ordering data. So it is clearly an important problem.
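The five-pass LSD scheme described above (one counting pass plus four distribution passes for 32-bit keys at radix 256) can be sketched as follows. This version uses a scratch buffer rather than the in-place variant mentioned in the text, and the function name is mine:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* LSD radix sort of 32-bit unsigned keys, radix 256: one counting
 * pass builds all four byte histograms at once, then four stable
 * distribution passes place the keys. */
void radix_sort_u32(uint32_t *a, size_t n)
{
    size_t count[4][256] = {{0}};
    uint32_t *tmp = malloc(n * sizeof *tmp);
    if (!tmp)
        return;                 /* allocation failed: leave input as-is */

    /* single counting pass: histogram every byte position at once */
    for (size_t i = 0; i < n; i++)
        for (int b = 0; b < 4; b++)
            count[b][(a[i] >> (8 * b)) & 0xFF]++;

    for (int b = 0; b < 4; b++) {
        /* prefix sums turn counts into destination offsets */
        size_t sum = 0;
        for (int v = 0; v < 256; v++) {
            size_t c = count[b][v];
            count[b][v] = sum;
            sum += c;
        }
        /* stable distribution pass on byte b (least significant first) */
        for (size_t i = 0; i < n; i++)
            tmp[count[b][(a[i] >> (8 * b)) & 0xFF]++] = a[i];
        memcpy(a, tmp, n * sizeof *a);
    }
    free(tmp);
}
```

Because each distribution pass is stable, ordering by bytes 0 through 3 in turn yields a fully sorted array, and the pass count is fixed at five regardless of the data distribution, which is exactly the appeal over comparison sorts.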
[HACKERS] Re: Why do we still perform a check for pre-sorted input within qsort variants?
"A Machine-Checked Proof of the Average-Case Complexity of Quicksort in Coq" by Eelis van der Weegen and James McKinna, Institute for Computing and Information Sciences, Radboud University Nijmegen, Heijendaalseweg 135, 6525 AJ Nijmegen, The Netherlands.

Contains a formal proof, validated by machine logic, that quicksort is quadratic in the worst case, and also a formal proof of the average runtime efficiency.
[HACKERS] Re: Why do we still perform a check for pre-sorted input within qsort variants?
Yes, you are right. I knew of a median-of-medians technique for pivot selection and I mistook that for the median-of-medians median selection algorithm (which it definitely isn't). I was not aware of a true linear-time median selection algorithm (which is what median of medians accomplishes). The fastest median selection algorithm that I was aware of was quickselect, which is only linear on average. I think that your analysis is correct, at any rate. I also think I will enjoy learning and experimenting with the median-of-medians algorithm. I found a link about it here: http://en.wikipedia.org/wiki/Selection_algorithm#Linear_general_selection_algorithm_-_Median_of_Medians_algorithm

-----Original Message-----
From: gsst...@gmail.com [mailto:gsst...@gmail.com] On Behalf Of Greg Stark
Sent: Saturday, March 09, 2013 1:21 PM
To: Dann Corbit
Cc: Bruce Momjian; Peter Geoghegan; Robert Haas; Tom Lane; PG Hackers
Subject: Re: Why do we still perform a check for pre-sorted input within qsort variants?

On Sat, Mar 9, 2013 at 8:52 PM, Dann Corbit dcor...@connx.com wrote: Median of medians selection of the pivot gives you O(n*log(n)). No. It does make O(n*n) far less probable, but it does not eliminate it. If it were possible, then introspective sort would be totally without purpose.

No really, quicksort with median-of-medians pivot selection is most definitely O(n*log(n)) worst case. This is textbook stuff. In fact, even the introspective sort paper mentions it as one of the options to fail over to if the partition size isn't decreasing rapidly enough.

The problem is that median of medians is O(n) rather than O(1). That doesn't change the O() growth rate, since there will be log(n) iterations, but it contributes to the constant factor, and the end result ends up being a constant factor larger than heap sort or merge sort. That also explains why your reference on the quicksort adversary doesn't apply.
It works by ordering elements you haven't compared yet and assumes that all but O(1) elements will still be eligible for reordering.

In any case I think you're basically right. What we have is basically a broken introspective sort that does more work than necessary and then handles fewer cases than it should. Implementing a better introspection that detects all perverse cases, and does so with a lower overhead than the current check, is a fine idea.

--
greg
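For contrast with the worst-case-linear median of medians discussed above, here is a sketch of plain quickselect with a median-of-three pivot, which is linear only on average; that is exactly the distinction the thread turns on. The function name and layout are mine; it assumes n >= 1 and k < n, and it rearranges the array as it goes:

```c
#include <stdlib.h>

/* quickselect: return the k-th smallest element (0-based) of an int
 * array; O(n) on average, but adversarial inputs can make the
 * median-of-three pivot choice degrade toward O(n*n). */
int quickselect(int *a, size_t n, size_t k)
{
    size_t lo = 0, hi = n - 1;
    for (;;) {
        if (lo == hi)
            return a[lo];
        /* median-of-three pivot from first, middle, and last element */
        size_t mid = lo + (hi - lo) / 2;
        int x = a[lo], y = a[mid], z = a[hi];
        int pivot = x < y ? (y < z ? y : (x < z ? z : x))
                          : (x < z ? x : (y < z ? z : y));
        /* two-pointer partition around the pivot value */
        size_t i = lo, j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
                if (j == lo) break;     /* avoid size_t underflow */
                j--;
            }
        }
        /* recurse (iteratively) into the side that contains index k */
        if (k <= j)      hi = j;
        else if (k >= i) lo = i;
        else             return a[k];   /* k lies in the pivot band */
    }
}
```

Replacing the median-of-three line with a recursive median-of-5-groups computation is what turns this into the worst-case O(n) algorithm, at the cost of the larger constant factor Greg describes.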
[HACKERS] Re: Why do we still perform a check for pre-sorted input within qsort variants?
-----Original Message-----
From: gsst...@gmail.com [mailto:gsst...@gmail.com] On Behalf Of Greg Stark
Sent: Saturday, March 09, 2013 5:16 PM
To: Dann Corbit
Cc: Bruce Momjian; Peter Geoghegan; Robert Haas; Tom Lane; PG Hackers
Subject: Re: Why do we still perform a check for pre-sorted input within qsort variants?

On Sat, Mar 9, 2013 at 10:22 PM, Dann Corbit dcor...@connx.com wrote: Yes, you are right. I knew of a median of medians technique for pivot selection and I mistook that for the median of medians median selection algorithm (which it definitely isn't). I was not aware of a true linear-time median selection algorithm (which is what median of medians accomplishes). The fastest median selection algorithm that I was aware of was quickselect, which is only linear on average. I think that your analysis is correct, at any rate.

Hm, I was using the terminology differently than the Wikipedia page. I was referring to the recursive median of 5s used as the pivot selection as "median of medians", and I still called Quicksort or Quickselect using that pivot Quicksort or Quickselect with that specific pivot choice algorithm. When using that pivot choice, Quicksort is O(n*log(n)) and Quickselect ("Median of Medians" on Wikipedia) is O(n), but the constant factor becomes larger than if the pivot choice algorithm is O(1). I suppose it's more interesting in the case of Quickselect, since there are no alternative algorithms with better constant factors, whereas for sorting we have other options.

I wonder if it makes sense to use timsort as the fallback for quicksort if the partition sizes are skewed. Timsort is specifically designed to handle presorted inputs well. On the other hand, it is itself a hybrid sort, so it might be getting overly complex to make it part of a hybrid algorithm.

My opinion (and it is no better than anyone else's) is that for a database you have to be very defensive in programming.
Since database systems can contain any possible distribution (and over time they will likely encounter almost every possibility) it is important to prevent unusual inputs from causing disaster. The reason we added introspection to the sort here is that we already had quicksort with an ordered-partition check along with a median-of-three sample from three medians of three (not to mention the standard recursion on the smallest partition first and the switch to insertion sort at small enough partition size). Sure enough, there was a customer who had a data distribution that caused bad behavior. We have customers with mainframe data where a single table can be 24 GB (I know this offhand; there are sure to be some that are larger), so a bad behavior on a sort *will* be noticed.

Sorry about the oddball quoting. I will look at fixing it so that my posts are not so hard to grok.
Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Bruce Momjian
Sent: Friday, March 08, 2013 11:22 AM
To: Peter Geoghegan
Cc: Robert Haas; Tom Lane; PG Hackers
Subject: Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?

On Mon, Feb 25, 2013 at 02:31:21PM +, Peter Geoghegan wrote: On 25 February 2013 11:49, Robert Haas robertmh...@gmail.com wrote: I did attempt to do some tinkering with this while I was playing with it, but I didn't come up with anything really compelling. You can reduce the number of comparisons on particular workloads by tinkering with the algorithm, but then somebody else ends up doing more comparisons, so it's hard to say whether you've really made things better. Or at least I found it so.

Right. To be honest, the real reason that it bothers me is that everything else that our qsort routine does that differs from classic quicksort (mostly quadratic insurance, like the median-of-medians pivot selection, but also the fallback to insertion sort when n < 7) is very well supported by peer-reviewed research. Like Tom, I find it implausible that Sedgewick and others missed a trick, where we did not, particularly with something so simple.

Perhaps we are more likely to be fed sorted data than a typical qsort use case.

Checking for pre-sorted input will not make the routine faster on average. However, it prevents a common perverse case where the runtime goes quadratic, so sorting 10^6 elements will take K*10^12 operations when the bad condition does occur. Checking for pre-sorted data can be thought of as an insurance policy. It's kind of a pain to pay the premiums, but you sure are glad you have it when an accident occurs. Because the check is linear, and the algorithm is O(n*log(n)), the cost is not terribly significant.

Switching to insertion sort for small partitions is almost always a good idea.
This idea was invented by Richard Singleton as ACM Algorithm 347.

A different sort of safeguard, as compared to checking for pre-ordered data, is to watch recursion depth. If we find that we have already gone past a depth of log2(n) levels and we are not ready to shift to insertion sort, then there is some sort of perverse data set. A traditional fix is to switch to heap sort at this point. Since heap sort is also O(n*log(n)) {though the constant multiplier is larger than for quicksort on average} you never get O(n*n) behavior. This method is called introspective sort.

There are, of course, other clever things that can be tried. The relaxed weak heapsort of Edelkamp and Stiegler, for instance, or using tries as in "Cache-Efficient String Sorting Using Copying" by Ranjan Sinha. There is also possibly significant benefit to using radix sort for fixed-width data that can be easily binned.

I seem to recall that a year or two back some study was done on quicksort methodology as used in PostgreSQL. As I recall, the algorithm used in PostgreSQL fared well in the tests.
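The depth-watching scheme just described can be sketched compactly: quicksort that spends one unit of a 2*log2(n) depth budget per partition and hands the remaining range to heapsort when the budget runs out. This is an illustrative toy on int arrays (names mine), not the PostgreSQL implementation:

```c
#include <stdlib.h>

/* restore the max-heap property below index root, heap ends at `end` */
static void sift_down(int *a, size_t root, size_t end)
{
    while (2 * root + 1 <= end) {
        size_t child = 2 * root + 1;
        if (child + 1 <= end && a[child] < a[child + 1])
            child++;                        /* pick the larger child */
        if (a[root] >= a[child])
            return;
        int t = a[root]; a[root] = a[child]; a[child] = t;
        root = child;
    }
}

/* O(n*log n) worst case: the fallback that caps quicksort's depth */
static void heapsort_int(int *a, size_t n)
{
    if (n < 2) return;
    for (size_t i = n / 2; i-- > 0; )       /* build the heap */
        sift_down(a, i, n - 1);
    for (size_t end = n - 1; end > 0; end--) {
        int t = a[0]; a[0] = a[end]; a[end] = t;
        sift_down(a, 0, end - 1);
    }
}

static void introsort_rec(int *a, size_t n, int depth)
{
    while (n > 1) {
        if (depth-- == 0) {      /* perverse input: switch to heapsort */
            heapsort_int(a, n);
            return;
        }
        int pivot = a[n / 2];
        size_t i = 0, j = n - 1;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
                if (j == 0) break;          /* avoid size_t underflow */
                j--;
            }
        }
        /* recurse into the smaller side, loop on the larger */
        if (j + 1 < n - i) {
            introsort_rec(a, j + 1, depth);
            a += i; n -= i;
        } else {
            introsort_rec(a + i, n - i, depth);
            n = j + 1;
        }
    }
}

void introsort(int *a, size_t n)
{
    int depth = 0;
    for (size_t m = n; m > 1; m >>= 1)
        depth++;                            /* depth = floor(log2(n)) */
    introsort_rec(a, n, 2 * depth);
}
```

Well-behaved inputs never exhaust the budget, so the average cost stays that of quicksort; only an adversarial distribution pays the heapsort constant factor, which is the trade-off the thread keeps circling.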
Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?
-----Original Message-----
From: Peter Geoghegan [mailto:peter.geoghega...@gmail.com]
Sent: Friday, March 08, 2013 12:00 PM
To: Bruce Momjian
Cc: Dann Corbit; Robert Haas; Tom Lane; PG Hackers
Subject: Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?

On 8 March 2013 11:48, Bruce Momjian br...@momjian.us wrote: On Fri, Mar 8, 2013 at 07:43:10PM +, Dann Corbit wrote: I seem to recall that a year or two back some study was done on quicksort methodology as used in PostgreSQL. As I recall, the algorithm used in PostgreSQL fared well in the tests. Well, that's good to hear. I wouldn't mind taking a look at that.

We're not the only ones that use (more or less) that same algorithm. I noticed that Redis does so too, just for example (though they didn't get wise to the problems with the swap_cnt pessimisation).

Turns out, it was a lot longer ago than I thought (2005!). The thread was called "Which qsort is used", but I also seem to recall that the topic was revisited a few times. See, for instance:

http://www.postgresql.org/message-id/pine.lnx.4.58.0512121138080.18...@eon.cs
http://www.cs.toronto.edu/~zhouqq/postgresql/sort/sort.html
Re: [HACKERS] [help] Is it possible to support remote COPY operation on PG?
You can use the libpq API: http://www.postgresql.org/docs/9.2/interactive/libpq-copy.html The PostgreSQL JDBC driver also exposes COPY, IIRC.

From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Xiong He
Sent: Tuesday, October 23, 2012 11:55 PM
To: PostgreSQL-development
Subject: [HACKERS] [help] Is it possible to support remote COPY operation on PG?

Is it possible to copy some table data from a remote client to the PG database server directly, without uploading the data file to the server side in advance?

---
Thanks & Regards,
Xiong He
Re: [HACKERS] CUDA Sorting
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Gaetano Mendola
Sent: Wednesday, February 15, 2012 2:54 PM
To: Peter Geoghegan; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] CUDA Sorting

On 15/02/2012 23:11, Peter Geoghegan wrote: On 15 February 2012 20:00, Gaetano Mendola mend...@gmail.com wrote: On 13/02/2012 19:48, Greg Stark wrote: I don't think we should be looking at either CUDA or OpenCL directly. We should be looking for a generic library that can target either and is well maintained and actively developed. Any GPU code we write ourselves would rapidly be overtaken by changes in the hardware and innovations in parallel algorithms. If we find a library that provides a sorting API and adapt our code to use it then we'll get the benefits of any new hardware feature as the library adds support for them.

I think one option is to make the sort function pluggable with a shared library/DLL. I see several benefits from this:

- It could be in the interest of the hardware vendor to provide the most powerful sort implementation (I'm sure, for example, that the TBB sort implementation is faster than pg_sort).
- It can permit people to play with it without being deeply involved in pg development and stuffs.

Sorry, but I find it really hard to believe that the non-availability of pluggable sorting is what's holding people back here. Some vanguard needs to go and prove the idea by building a rough prototype before we can even really comment on what an API should look like. For example, I am given to understand that GPUs generally sort using radix sort - resolving the impedance mismatch that prevents someone from using a non-comparison-based sort sure sounds like a lot of work for an entirely speculative reward.

AFAIK the thrust library uses radix sort if the keys you are sorting are POD data comparable with a < operator; otherwise it does a comparison-based sort using the operator provided.
http://docs.thrust.googlecode.com/hg/modules.html

I'm not saying that the non-availability of pluggable sort completely holds people back; I'm saying that it would simplify the process now and in the future. Of course, that's my opinion.

Someone who cannot understand tuplesort, which is not all that complicated, has no business trying to build GPU sorting into Postgres.

That sounds a bit harsh. I'm one of those indeed; I haven't looked into the details, not having enough time for it. At work we do GPU computing (not the sort type of stuff), and given the fact that I'm a Postgres enthusiast I asked myself: my server is able to sort around 500 million integers per second; if Postgres were able to do that as well it would be very nice. What can I say? Sorry for my thoughts.

I had a patch committed a few hours ago that almost included the capability of assigning an alternative sorting function, but only one with the exact same signature as my variant of qsort_arg. pg_qsort isn't used to sort tuples at all, by the way.

Then I did look in the wrong direction. Thank you for pointing that out.

Threading Building Blocks is not going to form the basis of any novel sorting implementation, because comparators in general are not thread safe, and it isn't available on all the platforms we support, and because of how longjmp interacts with C++ stack unwinding and so on and so on. Now, you could introduce some kind of parallelism into sorting integers and floats, but that's an awful lot of work for a marginal reward.

The TBB was just an example that came to mind. What do you mean by "you could introduce some kind of parallelism"? As far as I know, any algorithm using divide and conquer can be parallelized.

Radix sorting can be used for any data type, if you create a callback that provides the most significant bits in width buckets.
At any rate, I can't imagine why anyone would want to complain about sorting 40 times faster than before, considering the amount of time databases spend ordering data. I have a CUDA card in this machine (NVIDIA GeForce GTX 460) and I would not mind it a bit if my database ORDER BY clause suddenly started running ten times faster than before when I am dealing with a huge volume of data.

There have been other experiments along these lines, such as:

"GPU-based Sorting in PostgreSQL", Naju Mancheril, School of Computer Science, Carnegie Mellon University
www.cs.virginia.edu/~skadron/Papers/bakkum_sqlite_gpgpu10.pdf (this is for SQLite, but the grammar of SQLite is almost a pure subset of PostgreSQL, including things like vacuum...)
http://wiki.postgresql.org/images/6/65/Pgopencl.pdf
http://dl.acm.org/citation.cfm?id=1807207
http://www.scribd.com/doc/51484335/PostgreSQL-OpenCL-Procedural-Language-pgEast-March-2011

See also http://highscalability.com/scaling-postgresql-using-cuda
Re: [HACKERS] What Would You Like To Do?
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Nolan Sent: Tuesday, September 13, 2011 11:51 AM To: Joshua D. Drake Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] What Would You Like To Do? On Tue, Sep 13, 2011 at 12:26 PM, Joshua D. Drake j...@commandprompt.commailto:j...@commandprompt.com wrote: On 09/13/2011 10:13 AM, Michael Nolan wrote: The lists all seem to be focusing on the things that the developers would like to add to PostgreSQL, what about some things that users or ISPs might like to have, and thus perhaps something that companies might actually see as worth funding? Well just my own two cents ... but it all depends on who is doing the funding. At this point 80% of the work CMD codes for Pg (or tertiary projects and modules) is funded by companies. So let's not assume that companies aren't funding things. They are. But perhaps if a few 'commercial' features were on the wish list there would be more companies willing to fund development? The developers get a bit of what they want to work on, the production users get a bit of what they need, everybody's happy. For example: A fully integrated ability to query across multiple databases,possibly on multiple servers, something Oracle has had for nearly two decades. That isn't the approach to take. The fact that Oracle has it is not a guarantee that it is useful or good. If you need to query across databases (assuming within the same cluster) then you designed your database wrong and should have used our SCHEMA support (what Oracle calls Namespaces) instead. This is the difference between developers and real world users. Real world users may not have the ability, time or resources to redesign their databases just because that's the 'best' way to do something. Will it be the most efficient way to do it? Almost certainly not. 
I've been involved in a few corporate mergers, and there was a short-term need to do queries on the combined databases while the tiger team handling the IT restructuring figured out how (or whether) to merge the databases together. (One of these happened to be an Oracle/Oracle situation; it was a piece of cake even though the two data centers were 750 miles apart and the table structures had almost nothing in common. Another was a two-week headache, and the third was even worse!) In a perfect world, it would be nice if one could do combined queries linking a PostgreSQL database with an Oracle one, or a MySQL one, too. Because sometimes, that's what you gotta do. Even something that is several hundred times slower is going to be faster than merging the databases together. When I do this today, I have to write a program (in Perl or PHP) that accesses both databases and merges the data by hand. Microsoft uses linked servers, DB/2 uses DB/2 Connect, Informix uses Informix Connect, etc. At CONNX, our product suite provides this ability generically for any data source collection. It is obvious why such a thing is utterly mandatory for every large business. For example: The business purchases a CRM system for customer relationship management, like SAP. The business purchases an HCM system for human capital management, like PeopleSoft. The business purchases a manufacturing system, like MAXIM. Etc., etc., etc. Some of these systems may have the same database type, but it is highly unlikely that every solution to a business problem in the entire organization uses the same underlying database. People buy or build software systems to solve their business problems. There is a low probability that each and every business problem was solved by the same sets of tools from the same vendors. Therefore, the ability to process queries across heterogeneous systems is a fundamental business need. The larger the company, the more database systems you will find.
But even teeny-tiny organizations tend to have several different database systems needed to run their business. {snip}
[HACKERS] What was the exact version of PostgreSQL where the column name length changed from 31 to 63 characters?
I need to know so that I can handle cases like: Create table foolongcols( nevermindthefurthermorejustpleadinselfdefense char(5), nevermindthefurthermorejustpleadguilty char(5) ); I assume that other object names (table name, function name, etc.) are similarly affected. Is that correct?
Re: [HACKERS] What was the exact version of PostgreSQL where the column name length changed from 31 to 63 characters?
-Original Message- From: Jeff Davis [mailto:pg...@j-davis.com] Sent: Tuesday, April 26, 2011 11:44 AM To: Dann Corbit Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] What was the exact version of PostgreSQL where the column name length changed from 31 to 63 characters? On Tue, 2011-04-26 at 18:35 +0000, Dann Corbit wrote: I need to know so that I can handle cases like: Create table foolongcols( nevermindthefurthermorejustpleadinselfdefense char(5), nevermindthefurthermorejustpleadguilty char(5) ); I assume that other object names (table name, function name, etc.) are similarly affected. Is that correct? It was changed in this commit: http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=46bb23ac016714065711cf2a780e080c7310d66e which was first released in 7.3.0, as far as I can tell. Thanks (and also to Merlin Moncure) for finding this information.
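For reference, the limit in question comes from NAMEDATALEN, which that commit raised from 32 to 64, so identifiers truncate at 63 bytes (31 before 7.3). A minimal client-side sketch of the truncation, under the simplifying assumption of a single-byte encoding (the server additionally avoids splitting a multibyte character); truncate_identifier_copy is an invented helper name, not a libpq function:

```c
#include <assert.h>
#include <string.h>

#define NAMEDATALEN 64   /* server default since PostgreSQL 7.3; 32 before */

/* Copy an identifier, truncating to NAMEDATALEN - 1 bytes the way the
 * server does for plain ASCII names.  dst must hold NAMEDATALEN bytes. */
static void
truncate_identifier_copy(char *dst, const char *src)
{
    size_t len = strlen(src);

    if (len >= NAMEDATALEN)
        len = NAMEDATALEN - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}
```

Note that under the pre-7.3 value of 32, both example column names above share their first 31 bytes ("nevermindthefurthermorejustplea"), so they truncate to the same name, which is exactly why the version matters here.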
Re: [HACKERS] ORDER BY 1 COLLATE
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Andrew Dunstan Sent: Monday, April 18, 2011 1:43 PM To: Tom Lane Cc: Peter Eisentraut; pgsql-hackers Subject: Re: [HACKERS] ORDER BY 1 COLLATE On 04/18/2011 04:20 PM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: This came from a review by Noah Misch a great while ago: test= SELECT b FROM foo ORDER BY 1 COLLATE C; ERROR: 42804: collations are not supported by type integer According to SQL92, this should be supported. Do we want to bother? It doesn't look hard to fix, so it's really only a question of whether this would be useful, or its absence would be too confusing. The ORDER BY 1 business seems to me to be legacy anyway. I'm not inclined to put in even more hacks to make strange combinations work there --- I think we're likely to find ourselves painted into a corner someday as it is. It's likely to be used by SQL generators if nothing else, and I've been known to use it as a very convenient shorthand. It would seem to me like quite a strange inconsistency to allow ORDER BY n with some qualifiers but not others. I use ORDER BY result_set_column_number a lot, especially when the result set column is a complicated expression.
Re: [HACKERS] [GENERAL] Postgres 9.1 - Release Theme
Smells like April first to me. http://en.wikipedia.org/wiki/April_Fools'_Day From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Rajasekhar Yakkali Sent: Friday, April 01, 2011 10:08 AM To: dp...@postgresql.org Cc: pgsql-gene...@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [GENERAL] Postgres 9.1 - Release Theme Following a great deal of discussion, I'm pleased to announce that the PostgreSQL Core team has decided that the major theme for the 9.1 release, due in 2011, will be 'NoSQL'. ... the intention is to remove SQL support from Postgres, and replace it with a language called 'QUEL'. This will provide us with the flexibility we need to implement the features of modern NoSQL databases. With no SQL support there will obviously be some differences in the query syntax that must be used to access your data. hmm.. shock it is this shift for 9.1 due in mid 2011 is unexpectedly soon :) Curious to understand as to - how this relates to every feature that is provide at the moment based on RDBMS paradigm. ACID compliance, support for the features provided by SQL, referential integrity, joins, caching etc, .. - Also does this shift take into an assumption that all the use cases fit the likes of data access patterns usecases similar to facebook/twitter? or to address the the likes of those ? Thanks, Raj
Re: [HACKERS] would hw acceleration help postgres (databases in general) ?
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Hamza Bin Sohail Sent: Friday, December 10, 2010 3:10 PM To: pgsql-hackers@postgresql.org Subject: [HACKERS] would hw acceleration help postgres (databases in general) ? Hello hackers, I think I'm at the right place to ask this question. Based on your experience and the fact that you have written the Postgres code, can you tell what a rough breakdown, in your opinion, is for the time the database spends just fetching and writing stuff to memory versus the actual computation? The reason I ask is that of late there has been a push to put reconfigurable hardware on processor cores. What this means is that database writers can possibly identify the compute-intensive portions of the code, write hardware accelerators and/or custom instructions, and offload computation to these accelerators, which they would have programmed onto the FPGA. There is not much utility in doing this if there aren't considerable compute-intensive operations in the database (which would surprise me if true). I would suspect joins, complex queries, etc. may be very compute-intensive. Please correct me if I'm wrong. Moreover, if you were told that you have reconfigurable hardware which can perform pretty complex computations 10x faster than the base, would you think about synthesizing it directly on an FPGA and using it? I'd be more than glad to hear your guesstimates. Here is a sample project: http://www.cs.virginia.edu/~skadron/Papers/bakkum_sqlite_gpgpu10.pdf And another: http://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/ngm/15-823/project/Final.pdf
Re: [HACKERS] Proposal: First step towards Intelligent,integrated database
I am probably just being thick, but how is your idea different from create domain: http://www.postgresql.org/docs/current/static/sql-createdomain.html From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of ghatpa...@vsnl.net Sent: Wednesday, December 01, 2010 2:20 AM To: pgsql hackers Subject: [HACKERS] Proposal: First step towards Intelligent,integrated database Importance: High Hello, Here is the proposal: My 1st step towards Intelligent, Integrated database. I am explaining the proposal with the use of example. Example: We will have a master table say CustMast and a transaction table say salesOrder table. View of CustMast: CustCodeNumber(5), CustName Varchar(30), CustAdrsLine1 Varchar, CustAdrsLine2 varchar etc. View of SalesOrder: SordnoNumber(8), Sorddt date, CustCode Number(5) - present way of defining. Proposed way is: CustCode Object CustMast. --- New data type to be introduced called O Object and create table definition to be modified whenever data type is 'O', it will accept object name (in this case table name). Here I want to inform data definition that field and its data type is already defined in master table and use the same data type here and both tables are linked with this field. We will be using same field name in both tables if not along with table name field name is to be accepted in create table definition. Advantages: 1. Now database knows that custcode in salesorder is a foreign key, a table constraint can be created. It also knows that index to be created on this field. 2. In present situation select statement for selecting values from join of both tables will be Select sordno, sorddt, custcode, custname, custadrsline1 from salesorder, custmast where salesorder.custcode=custmast.custcode. 3. In proposed way we can write this statement as: Select sordno, sorddt, custcode, custname, custadrsline1 from salesorder (with proper changes in program which pickup values from select statement. 4. 
Field can be from another table in same database or from Excel sheet column. 5. Views need not be created as all tables are properly linked with each other in an application. 6. This is only first step and many advantages can be a result of brainstorm. 7.This will change RDBMS, Tools and ERP to next generation. For any clarifications pl contact. Pl give your feedback. Regards Many, Vijay Ghatpande. Cell: +91 9822456142.
Re: [HACKERS] Implementation of Date/Time Input Interpretation
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Francis Markham Sent: Tuesday, June 22, 2010 7:13 PM To: pgsql-hackers@postgresql.org Subject: [HACKERS] Implementation of Date/Time Input Interpretation Greetings all, I am currently implementing a script to import data into Postgres. I would like to apply the algorithm to detect date and time values outlined at http://developer.postgresql.org/pgdocs/postgres/datetime-input-rules.html However, I am unfamiliar with (and somewhat intimidated by) the Postgres source tree. Would any kind person be able to point me to the source file(s) that implement the above algorithm? You will find it under src/backend/utils/adt/datetime.c To import data into Postgres, I would guess that reading the date/time routine is probably not what you want to do. If you want to move the data in using a compiled program, then use an ODBC driver; PostgreSQL comes with a free one. OLE DB is another sensible alternative, or JDBC if you want to use Java. If you want to bulk load lots of data at high speed, read up on the COPY command. If you just want to insert some rows using SQL, then simply perform an INSERT using psql or some other interface of your choice. What is it exactly that you are trying to accomplish?
Re: [HACKERS] Implementation of Date/Time Input Interpretation
-Original Message- From: Andrew Dunstan [mailto:and...@dunslane.net] Sent: Tuesday, June 22, 2010 7:47 PM To: Francis Markham Cc: Dann Corbit; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Implementation of Date/Time Input Interpretation Francis Markham wrote: Thank you for your prompt reply. What is it exactly that you are trying to accomplish? I want to be able to, from my own script, determine if Postgres will be able to interpret a string as a date or time. If you can suggest a better way of accomplishing this beyond reimplementing your algorithm, I would be happy to hear it! Call the appropriate input function in PL/pgSQL and trap a data exception? These routines are going to be quite hard to mimic, I suspect. Getting Postgres to do the work for you is probably a better way to go if you can. For the O.P.: This is the specification of the input format that is needed for date/time values: http://www.postgresql.org/docs/8.4/interactive/datatype-datetime.html See also: http://www.postgresql.org/docs/8.4/interactive/datetime-appendix.html If he wants to simply validate date/time values before insertion, I would suggest a package like libmcal and pick out the file datetime.c, in particular: extern bool datevalid(int year, int mon, int mday); extern bool timevalid(int hour, int min, int sec); The PostgreSQL database routine has lots of fluff intended for interfacing with the database, etc., which makes a simpler approach easier if validation is all that is wanted. Of course, date/time/calendar libraries are available in just about every language.
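For a purely client-side pre-check in C, one can lean on the platform's own parser rather than mimic datetime.c. A minimal sketch, assuming POSIX strptime and one fixed ISO format; looks_like_iso_date is an invented name, and real PostgreSQL date input accepts far more formats than this:

```c
#define _XOPEN_SOURCE 700   /* expose strptime */
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Decide whether a string parses fully as YYYY-MM-DD.  The whole input
 * must be consumed; trailing junk means rejection. */
static bool
looks_like_iso_date(const char *s)
{
    struct tm tm;
    char *end;

    memset(&tm, 0, sizeof(tm));
    end = strptime(s, "%Y-%m-%d", &tm);
    return end != NULL && *end == '\0';
}
```

Note that strptime's field-level checks are looser than the server's (it does not cross-check day-of-month against the month), so this is a pre-filter, not a substitute for letting the input function decide.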
Re: [HACKERS] GSoC
Maybe this can be helpful: http://sourceforge.net/search/?type_of_search=soft&words=fingerprint http://sourceforge.net/search/?type_of_search=soft&words=image+recognition -Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Anindya Jyoti Roy Sent: Monday, March 29, 2010 1:04 PM To: pgsql-hackers@postgresql.org Subject: [HACKERS] GSoC I have an idea for implementing an image database system, which will let you search against images and fingerprints. The idea, crudely, is the following: I want to implement an image database system with these qualities: 1. It will store images along with the other attributes. 2. The database search engine will be able to search for images as well. 3. It will list the matching images in order of degree of match. 4. In this matching system I will likely divide the image into important parts and match them. 5. The database will also contain fingerprints, which may be the primary key. 6. It will match the fingerprints on the basis of their nodes, then make a bitmap and match the coordinates, allowing some error. 7. To account for the angle of the picture when searching for equality, we will introduce some logical amount of error. 8. This project can be coded by me, as I have previous experience with this type of project. 9. This will make a definitely good search engine, as well as an innovative and uncommon one. About me: I am a sophomore at the Indian Institute of Technology Kanpur in CSE BTech. I have a good academic record to date and a keen interest in data mining and machine learning, as well as image processing. I am attaching my resume in short form with this mail, hoping it may help you to judge my credibility. I am strong in coding and maths and able to solve problems within a given timeline. I can give a more detailed explanation of my project, but I just wanted to give an outline here. Let me know your feedback.
Thanks for your time, Yours faithfully, Anindya Jyoti Roy Sophomore, CSE BTech, IIT Kanpur
Re: [HACKERS] OpenVMS?
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Rayson Ho Sent: Tuesday, February 16, 2010 9:39 AM To: David Fetter Cc: Andrew Dunstan; PG Hackers Subject: Re: [HACKERS] OpenVMS? On Tue, Feb 16, 2010 at 11:22 AM, David Fetter wrote: * Shell access from several accounts * Git or cvs client * Compiler tools * Perl of a fairly recent vintage * Outbound http access I had access to the HP TestDrive before they closed it down (the Unix servers went down in Sept 08, but the VMS cluster was running till Jan 2010). They blocked all outbound internet access from the TestDrive servers -- only telnet and ftp were allowed. Count me in for the OpenVMS porting effort (but I guess I will apply for an account separately, as I might port other things to OpenVMS in the future). I believe the porting effort is larger than a new Unix port but smaller than the Windows port, as most of the Unix and POSIX functions and system calls are supported on OpenVMS. MySQL has around 10 functions changed or written specifically for OpenVMS; most of those are related to utime(), $UMASK / $UMASKDIR, open(), and Unix pathnames. (I think utime() support was added a few years ago in OpenVMS 7.3 / 8.0 -- so maybe the code was added to MySQL for earlier VMS versions.) PostgreSQL uses fork(), which is not supported on OpenVMS. However, the techniques used by the WIN32 version of internal_forkexec() in src/backend/postmaster/postmaster.c give the VMS version a good starting point. For PostgreSQL, you will probably want to use LIB$SPAWN() as a rough equivalent of CreateProcess() on Windows: http://www.sysworks.com.au/disk$vaxdocsep002/opsys/vmsos721/5932/5932pro_041.html Lastly, are we going to support ODS-2 disks? And are we going to require GNV to be installed for building and running PostgreSQL on OpenVMS?
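The fork-versus-spawn distinction being described can be illustrated in portable C. This is only an analogy: posix_spawn stands in here for CreateProcess() or LIB$SPAWN(), and spawn_backend is an invented name, not anything in postmaster.c. The key point is that a spawned child starts from a fresh executable image and must receive its state explicitly, rather than inheriting the parent's memory as a fork()ed child does.

```c
#include <assert.h>
#include <spawn.h>
#include <sys/wait.h>

extern char **environ;

/* Launch a fresh process image (spawn-style, no fork) and wait for it.
 * Returns the child's exit status, or -1 on failure. */
static int
spawn_backend(const char *path, char *const argv[])
{
    pid_t pid;
    int status;

    if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

This is why the Windows internal_forkexec() passes backend state through a handed-off parameter file rather than relying on copy-on-write inheritance; a VMS port would face the same constraint.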
[HACKERS] MonetDB test says that PostgreSQL often has errors or missing results
See: http://monetdb.cwi.nl/SQL/Benchmark/TPCH/ If the result is correct, then the problem queries should be added to the regression test suite. If the result is not correct, then perhaps they could get assistance on proper configuration of PostgreSQL and rerun the tests.
[HACKERS] Source code for pg_bulkload
How can I obtain the source code for pg_bulkload? I am interested in writing an API version, so that I can embed this loading facility into programs. The page http://pgfoundry.org/projects/pgbulkload/ has links to the binaries, but I want the source.
Re: [HACKERS] planner or statistical bug on 8.5
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Pavel Stehule Sent: Monday, January 11, 2010 11:55 PM To: PostgreSQL Hackers Cc: Tom Lane Subject: [HACKERS] planner or statistical bug on 8.5 Hello, I checked a query and I was surprised by a very strange plan:

postgres=# create table a(a int primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index a_pkey for table a
CREATE TABLE
postgres=# create table b(b int primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index b_pkey for table b
CREATE TABLE
postgres=# create table c(c int primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index c_pkey for table c
CREATE TABLE
postgres=# ANALYZE ;
ANALYZE
postgres=# explain select a, b from a,b,c;
                              QUERY PLAN
---------------------------------------------------------------------
 Nested Loop  (cost=0.00..276595350.00 rows=1382400 width=8)
   ->  Nested Loop  (cost=0.00..115292.00 rows=576 width=8)
         ->  Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
         ->  Materialize  (cost=0.00..82.00 rows=2400 width=4)
               ->  Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4)
   ->  Materialize  (cost=0.00..82.00 rows=2400 width=0)
         ->  Seq Scan on c  (cost=0.00..34.00 rows=2400 width=0)
(7 rows)

You have no join columns, so it is a simple cross product. Perhaps you meant something like this: EXPLAIN SELECT a.a, b.b, c.c FROM a a, b b, c c WHERE a.a = b.b AND a.a = c.c
Re: [HACKERS] boolean in C
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Bernd Helmle Sent: Thursday, July 16, 2009 8:47 AM To: Grzegorz Jaskiewicz Cc: pgsql-hackers Hackers Subject: Re: [HACKERS] boolean in C --On 16. Juli 2009 13:32:03 +0100 Grzegorz Jaskiewicz g...@pointblue.com.pl wrote: oh, another thing. stdbool is a C99 standard feature, not a gcc extension. There might be compiler versions out there which claim to be C99 but do not provide fully compliant include headers. Sun Studio 12 at least has the following in its documentation, as a quick search brings up: Though the compiler defaults to supporting the features of C99 listed below, standard headers provided by the Solaris software in /usr/include do not yet conform with the 1999 ISO/IEC C standard. It's more or less a generic problem. There are only a handful of fully functional C99 compilers[0], and all the others have some C99 features to one degree or another. Microsoft's compiler is particularly abysmal, but then again, they make no claim of C99 compliance, so there is nothing to complain about there. Those few features that they do implement are implemented in a non-standard way. GCC is also only partially compliant[1]. I believe that the Dinkum library is the only certified C99 standard library[2] as well. [0] see: http://www.peren.com/pages/branding_set.htm [1] see: http://gcc.gnu.org/c99status.html [2] see: http://www.dinkumware.com/manuals/ I don't think that a product expected to run on as many platforms as PostgreSQL could even be written in C99 code, because there are not enough compliant compilers available. IMO-YMMV
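A common way to paper over a missing or broken stdbool.h, sketched here: detect C99 at preprocessing time and fall back to a one-byte typedef otherwise. PostgreSQL's own c.h historically used a similar typedef char fallback; pg_like_bool is an invented name for illustration, not the project's actual spelling.

```c
#include <assert.h>

/* Use the real <stdbool.h> where the compiler advertises C99;
 * otherwise fall back to a one-byte typedef with hand-rolled macros. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdbool.h>
typedef bool pg_like_bool;
#else
typedef char pg_like_bool;
#ifndef true
#define true  ((pg_like_bool) 1)
#define false ((pg_like_bool) 0)
#endif
#endif
```

Either branch yields a one-byte type with the same truth semantics, which is what matters for on-disk and in-struct layout; the pitfall the thread alludes to is compilers that define __STDC_VERSION__ for C99 while shipping headers that do not conform.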
Re: [HACKERS] Throw some low-level C scutwork at me
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Robert Haas Sent: Friday, May 01, 2009 6:00 PM To: Andy Lester Cc: Dimitri Fontaine; PostgreSQL-development Subject: Re: [HACKERS] Throw some low-level C scutwork at me On Fri, May 1, 2009 at 4:35 PM, Andy Lester a...@petdance.com wrote: There should be nothing to maintain, if it's done right. Any line in the source tree will have to get maintained, or why would you spend any time writing it? I meant by hand. See doc/FAQ_DEV and those specific lines: I see no such file. Perhaps it doesn't get exported into the git mirror? Sorry if the push-back has been read as harsh, but I've got the (very personal) feeling that to become a contributor to PostgreSQL, you *will* have to be able to read this level of criticism back from the mail you send. I'm all for criticism of ideas. I wish there had been some in Tom's original mail. OK, so, when I initially started catching up on this thread, I was kind of feeling annoyed at Tom, and I still wish he'd say something along the lines of I did not mean to give offense and I'm sorry if my words came across in a way that I did not intend rather than just explaining why he reacted the way he did. I think it's just Tom's way. Higgins (from Pygmalion): The great secret, Eliza, is not having bad manners or good manners or any other particular sort of manners, but having the same manner for all human souls: in short, behaving as if you were in Heaven, where there are no third-class carriages, and one soul is as good as another. 
For comparison, here is a recent message from Tom addressed to me, along with my response: = -Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Sent: Tuesday, April 28, 2009 3:21 PM To: Dann Corbit Cc: Andrew Dunstan; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Building Postgresql under Windows question Dann Corbit dcor...@connx.com writes: From: Andrew Dunstan [mailto:and...@dunslane.net] Why won't PQstatus(conn) == CONNECTION_OK be true and thus the code will succeed without requiring a password? It returns the value CONNECTION_STARTED It certainly shouldn't. You're effectively asserting that PQconnectdb is broken for everyone on every platform, which is demonstrably not the case. Are you fooling with modified libpq code by any chance? No. The service works correctly when I use password authentication. The service does not work correctly when I use trust. What happens when you use trust in pg_hba.conf? = Now, I do not think that Tom is a bad person at all. Quite the contrary, he's clearly very smart and also goes to great lengths in attempts to be helpful (sometimes chewing someone out is being helpful as well). He sometimes comes off as brusque -- but to some degree that comes from reading between the lines and being over sensitive. I think the lesson to be learned here is that when making any sort of message sent to the internet, the most sensible policy is to grow a skin at least five inches thick. I am certainly glad that Tom is on the PostgreSQL team. I expect that the product will come out ahead from it in the long run. And as for being offended, I am reminded of a Bible proverb: (Ecclesiastes 7:9) Do not hurry yourself in your spirit to become offended, for the taking of offense is what rests in the bosom of the stupid ones. That having been said, as far as I can tell, your feeling that Tom said something rude is based largely on the fact that he used the word sucked, and perhaps the phrase rejected out of hand. 
Admittedly, Tom could have described why he thought it sucked rather than just saying that it did, and he could have said that he would vote against accepting it and believed that others would not like it either rather than phrasing it in the way that he did. Then again, you didn't offer any justification for your desire to have them in there either. You didn't ask whether they'd been previously considered or whether the community would find them desirable. You didn't make an argument for why they'd be better than the system currently in use or even, at least as far as can be determined from reading the email that set off this flame war, take the time to understand that system before proposing your own. One thing I have discovered about pgsql-hackers is that it is very easy to be accused of not having done your homework even if you actually have. I have seen more than one well-thought-out proposal shot down by a committer who (as it seemed to me) had thought it through less carefully than the person proposing it. On the other hand, there are five or ten half-baked ideas for every good one, so the committers have
[HACKERS] Building Postgresql under Windows question
We are still having trouble with the service controller pg_ctl.exe vanishing after some period of time. Hence I am doing debug builds from the source tree according to the instructions found at: http://developer.postgresql.org/pgdocs/postgres/install-win32-full.html Specifically, I am using this method: It is also possible to build from inside the Visual Studio GUI. In this case, you need to run: perl mkvcbuild.pl There are a few projects that will not build (only 7 out of 99). I have not installed ossp-uuid yet, so one of the failures is expected. On the CONTRIB projects I am not concerned at all. However, on the internationalization failures, I want to know what the failures mean. Here is the output of the debug build process from the Visual Studio IDE (includes all failures): 1-- Build started: Project: utf8_and_euc_kr, Configuration: Debug Win32 -- 1Generate DEF file 1Not re-generating UTF8_AND_EUC_KR.DEF, file already exists. 1Linking... 1utf8_and_euc_kr.def : error LNK2001: unresolved external symbol euc_jp_to_utf8 1utf8_and_euc_kr.def : error LNK2001: unresolved external symbol pg_finfo_euc_jp_to_utf8 1utf8_and_euc_kr.def : error LNK2001: unresolved external symbol pg_finfo_utf8_to_euc_jp 1utf8_and_euc_kr.def : error LNK2001: unresolved external symbol utf8_to_euc_jp 1Debug\utf8_and_euc_kr\utf8_and_euc_kr.lib : fatal error LNK1120: 4 unresolved externals 2-- Build started: Project: utf8_and_euc_cn, Configuration: Debug Win32 -- 3-- Build started: Project: euc_kr_and_mic, Configuration: Debug Win32 -- 1Build log was saved at file://c:\dcorbit64\postgresql\postgresql-8.3.7\Debug\utf8_and_euc_kr\BuildLog.htm 1utf8_and_euc_kr - 5 error(s), 0 warning(s) 2Generate DEF file 3Generate DEF file 3Not re-generating EUC_KR_AND_MIC.DEF, file already exists. 2Not re-generating UTF8_AND_EUC_CN.DEF, file already exists. 3Linking... 2Linking...
3euc_kr_and_mic.def : error LNK2001: unresolved external symbol euc_jp_to_mic 2utf8_and_euc_cn.def : error LNK2001: unresolved external symbol koi8r_to_utf8 2utf8_and_euc_cn.def : error LNK2001: unresolved external symbol pg_finfo_koi8r_to_utf8 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol euc_jp_to_sjis 2utf8_and_euc_cn.def : error LNK2001: unresolved external symbol pg_finfo_utf8_to_koi8r 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol mic_to_euc_jp 2utf8_and_euc_cn.def : error LNK2001: unresolved external symbol utf8_to_koi8r 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol mic_to_sjis 2Debug\utf8_and_euc_cn\utf8_and_euc_cn.lib : fatal error LNK1120: 4 unresolved externals 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol pg_finfo_euc_jp_to_mic 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol pg_finfo_euc_jp_to_sjis 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol pg_finfo_mic_to_euc_jp 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol pg_finfo_mic_to_sjis 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol pg_finfo_sjis_to_euc_jp 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol pg_finfo_sjis_to_mic 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol sjis_to_euc_jp 3euc_kr_and_mic.def : error LNK2001: unresolved external symbol sjis_to_mic 3Debug\euc_kr_and_mic\euc_kr_and_mic.lib : fatal error LNK1120: 12 unresolved externals 2Build log was saved at file://c:\dcorbit64\postgresql\postgresql-8.3.7\Debug\utf8_and_euc_cn\BuildLog.htm 2utf8_and_euc_cn - 5 error(s), 0 warning(s) 4-- Build started: Project: utf8_and_shift_jis_2004, Configuration: Debug Win32 -- 4Generate DEF file 3Build log was saved at file://c:\dcorbit64\postgresql\postgresql-8.3.7\Debug\euc_kr_and_mic\BuildLog.htm 3euc_kr_and_mic - 13 error(s), 0 warning(s) 4Not re-generating UTF8_AND_SHIFT_JIS_2004.DEF, file already exists.
5-- Build started: Project: uuid-ossp, Configuration: Debug Win32 -- 5Compiling... 4Linking... 5uuid-ossp.c 4utf8_and_shift_jis_2004.def : error LNK2001: unresolved external symbol pg_finfo_uhc_to_utf8 4utf8_and_shift_jis_2004.def : error LNK2001: unresolved external symbol pg_finfo_utf8_to_uhc 4utf8_and_shift_jis_2004.def : error LNK2001: unresolved external symbol uhc_to_utf8 4utf8_and_shift_jis_2004.def : error LNK2001: unresolved external symbol utf8_to_uhc 4Debug\utf8_and_shift_jis_2004\utf8_and_shift_jis_2004.lib : fatal error LNK1120: 4 unresolved externals 5.\contrib\uuid-ossp\uuid-ossp.c(27) : fatal error C1083: Cannot open include file: 'uuid.h': No such file or directory 5Build log was saved at file://c:\dcorbit64\postgresql\postgresql-8.3.7\Debug\uuid-ossp\BuildLo g.htm 5uuid-ossp - 1 error(s), 0 warning(s) 4Build log was saved at file://c:\dcorbit64\postgresql\postgresql-8.3.7\Debug\utf8_and_shift_ji s_2004\BuildLog.htm 4utf8_and_shift_jis_2004 - 5 error(s), 0 warning(s) 6-- Build started: Project: hstore, Configuration: Debug Win32 -- 6Generate DEF file 6Not
Re: [HACKERS] Building Postgresql under Windows question
-Original Message- From: Andrew Dunstan [mailto:and...@dunslane.net] Sent: Tuesday, April 28, 2009 12:42 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Building Postgresql under Windows question Dann Corbit wrote: Pg_ctl.exe is exiting with a success code from line 1946 of PG_CTL.C. This is not appropriate behavior for a service unless shutdown has been requested. pg_ctl calls *StartServiceCtrlDispatcher*(). It can only get to the line you mention when called as a service after that call returns. MSDN states: "If *StartServiceCtrlDispatcher* succeeds, it connects the calling thread to the service control manager and does not return until all running services in the process have entered the SERVICE_STOPPED state." So it appears that something is causing your service to enter that state. It is interesting that it happens even if I run no queries at all. This is the only reference to the service control dispatcher I can find in pg_ctl.c:

static void
pgwin32_doRunAsService(void)
{
	SERVICE_TABLE_ENTRY st[] = {{register_servicename, pgwin32_ServiceMain},
	{NULL, NULL}};

	if (StartServiceCtrlDispatcher(st) == 0)
	{
		write_stderr(_("%s: could not start service \"%s\": error code %d\n"),
					 progname, register_servicename, (int) GetLastError());
		exit(1);
	}
}

BTW, the exit(1) calls should be exit(EXIT_FAILURE), though there will be no difficulty on any POSIX system. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Building Postgresql under Windows question
-Original Message- From: Andrew Dunstan [mailto:and...@dunslane.net] Sent: Tuesday, April 28, 2009 12:42 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Building Postgresql under Windows question Dann Corbit wrote: Pg_ctl.exe is exiting with a success code from line 1946 of PG_CTL.C. This is not appropriate behavior for a service unless shutdown has been requested. pg_ctl calls *StartServiceCtrlDispatcher*(). It can only get to the line you mention when called as a service after that call returns. MSDN states: "If *StartServiceCtrlDispatcher* succeeds, it connects the calling thread to the service control manager and does not return until all running services in the process have entered the SERVICE_STOPPED state." So it appears that something is causing your service to enter that state. It appears that SERVICE_STOPPED comes from here:

static void WINAPI
pgwin32_ServiceMain(DWORD argc, LPTSTR * argv)
{
	PROCESS_INFORMATION pi;
	DWORD		ret;
	DWORD		check_point_start;

	/* Initialize variables */
	status.dwWin32ExitCode = S_OK;
	status.dwCheckPoint = 0;
	status.dwWaitHint = 60000;
	status.dwServiceType = SERVICE_WIN32_OWN_PROCESS;
	status.dwControlsAccepted = SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN | SERVICE_ACCEPT_PAUSE_CONTINUE;
	status.dwServiceSpecificExitCode = 0;
	status.dwCurrentState = SERVICE_START_PENDING;

	memset(&pi, 0, sizeof(pi));

	read_post_opts();

	/* Register the control request handler */
	if ((hStatus = RegisterServiceCtrlHandler(register_servicename,
							pgwin32_ServiceHandler)) == (SERVICE_STATUS_HANDLE) 0)
		return;

	if ((shutdownEvent = CreateEvent(NULL, true, false, NULL)) == NULL)
		return;

	/* Start the postmaster */
	pgwin32_SetServiceStatus(SERVICE_START_PENDING);
	if (!CreateRestrictedProcess(pgwin32_CommandLine(false), &pi, true))
	{
		pgwin32_SetServiceStatus(SERVICE_STOPPED);
		return;
	}
	postmasterPID = pi.dwProcessId;
	postmasterProcess = pi.hProcess;
	CloseHandle(pi.hThread);

	if (do_wait)
	{
		write_eventlog(EVENTLOG_INFORMATION_TYPE, _("Waiting for server startup...\n"));
		if (test_postmaster_connection(true) == false)
		{
			write_eventlog(EVENTLOG_INFORMATION_TYPE, _("Timed out waiting for server startup\n"));
			pgwin32_SetServiceStatus(SERVICE_STOPPED);
			return;
		}
		write_eventlog(EVENTLOG_INFORMATION_TYPE, _("Server started and accepting connections\n"));
	}

	/*
	 * Save the checkpoint value as it might have been incremented in
	 * test_postmaster_connection
	 */
	check_point_start = status.dwCheckPoint;

	pgwin32_SetServiceStatus(SERVICE_RUNNING);

	/* Wait for quit... */
	ret = WaitForMultipleObjects(2, shutdownHandles, FALSE, INFINITE);

	pgwin32_SetServiceStatus(SERVICE_STOP_PENDING);
	switch (ret)
	{
		case WAIT_OBJECT_0:		/* shutdown event */
			kill(postmasterPID, SIGINT);

			/*
			 * Increment the checkpoint and try again. Abort after 12
			 * checkpoints as the postmaster has probably hung.
			 */
			while (WaitForSingleObject(postmasterProcess, 5000) == WAIT_TIMEOUT && status.dwCheckPoint < 12)
				status.dwCheckPoint++;
			break;

		case (WAIT_OBJECT_0 + 1):	/* postmaster went down */
			break;

		default:
			/* shouldn't get here? */
			break;
	}

	CloseHandle(shutdownEvent);
	CloseHandle(postmasterProcess);

	pgwin32_SetServiceStatus(SERVICE_STOPPED);
}

I will set a breakpoint on every place that the status is set to SERVICE_STOPPED and report what I have found. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Building Postgresql under Windows question
In this function: static bool test_postmaster_connection(bool do_checkpoint) this code will never succeed:

	snprintf(connstr, sizeof(connstr),
			 "dbname=postgres port=%s connect_timeout=5", portstr);

	for (i = 0; i < wait_seconds; i++)
	{
		if ((conn = PQconnectdb(connstr)) != NULL &&
			(PQstatus(conn) == CONNECTION_OK ||
			 PQconnectionNeedsPassword(conn)))
		{
			PQfinish(conn);
			success = true;
			break;
		}

Because pg_hba.conf has this:

# TYPE  DATABASE    USER    CIDR-ADDRESS    METHOD
# IPv4 local connections:
host    all         all     127.0.0.1/32    trust
# IPv6 local connections:
host    all         all     ::1/128         trust

to allow connections from the local machine. (A password is not needed to connect, so the code always fails.) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
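The loop above is essentially a bounded retry: probe the server once per second until a probe succeeds or wait_seconds runs out. A minimal portable C sketch of that pattern, with a hypothetical try_connect() standing in for the PQconnectdb/PQstatus/PQconnectionNeedsPassword test (the stub simulates a server that becomes ready on the third attempt):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the libpq probe: returns true once the
 * simulated server accepts connections (here, on attempt 3). */
static int attempts;
static bool
try_connect(void)
{
	return ++attempts >= 3;
}

/* Bounded retry, the same shape as test_postmaster_connection:
 * one probe per iteration, up to wait_seconds probes. */
static bool
wait_for_server(int wait_seconds)
{
	for (int i = 0; i < wait_seconds; i++)
	{
		if (try_connect())
			return true;
		/* the real code sleeps about one second between probes */
	}
	return false;
}
```

Note that in the real function a probe counts as success either when PQstatus(conn) == CONNECTION_OK or when PQconnectionNeedsPassword(conn) is true, since a password prompt still proves the postmaster is listening.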
Re: [HACKERS] Building Postgresql under Windows question
-Original Message- From: Andrew Dunstan [mailto:and...@dunslane.net] Sent: Tuesday, April 28, 2009 2:27 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Building Postgresql under Windows question Dann Corbit wrote: In this function: static bool test_postmaster_connection(bool do_checkpoint) this code will never succeed:

	snprintf(connstr, sizeof(connstr),
			 "dbname=postgres port=%s connect_timeout=5", portstr);

	for (i = 0; i < wait_seconds; i++)
	{
		if ((conn = PQconnectdb(connstr)) != NULL &&
			(PQstatus(conn) == CONNECTION_OK ||
			 PQconnectionNeedsPassword(conn)))
		{
			PQfinish(conn);
			success = true;
			break;
		}

Because pg_hba.conf has this:

# TYPE  DATABASE    USER    CIDR-ADDRESS    METHOD
# IPv4 local connections:
host    all         all     127.0.0.1/32    trust
# IPv6 local connections:
host    all         all     ::1/128         trust

to allow connections from the local machine. (A password is not needed to connect, so the code always fails.)

Why won't PQstatus(conn) == CONNECTION_OK be true, and thus the code succeed without requiring a password?

It returns the value CONNECTION_STARTED. I have found a work-around for now. If I set the method to password in pg_hba.conf, the service starts and runs correctly. It is only when the method is set to trust that we get 100% failures. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
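For reference, the work-around described above amounts to changing only the METHOD column of the relevant pg_hba.conf line; a sketch of the two variants being compared (only one would be present at a time):

```
# TYPE  DATABASE  USER  CIDR-ADDRESS   METHOD
host    all       all   127.0.0.1/32   trust      # failing case: no credential exchange
host    all       all   127.0.0.1/32   password   # work-around: cleartext password required
```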
Re: [HACKERS] Building Postgresql under Windows question
Pg_ctl.exe is exiting with a success code from line 1946 of PG_CTL.C. This is not appropriate behavior for a service unless shutdown has been requested. -Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Dann Corbit Sent: Tuesday, April 28, 2009 11:47 AM To: pgsql-hackers@postgresql.org Subject: [HACKERS] Building Postgresql under Windows question We are still having trouble with the service controller pg_ctl.exe vanishing after some period of time. [snip -- quoted build log identical to the original message above]
[HACKERS] One click installer and pgInstaller collide
If you install with the PostgreSQL group's installer for Windows: http://www.postgresql.org/ftp/binary and then install a second instance of PostgreSQL using the EnterpriseDB one-click installer: http://www.enterprisedb.com/products/pgdownload.do#windows an error may occur, probably in one installation or the other. The pgAdmin III product uses a DLL called libiconv-2.dll. One of these DLL files has an entry point called libiconv_set_relocation_prefix and the other one does not. When you reference the DLL file, it will complain about an entry point that could not be located. It is probably a fairly esoteric issue, since most people (if they install multiple instances of a product) will install only a single version of it. But we support everything under the sun, so I purposely try to install as many versions as possible for testing. Anyway, I thought you might like to know about this odd collision. 64-bit Windows platform. PG Installer version, installed first, was 8.3.5. EnterpriseDB version was 8.3.7. There was an error message (file in use) during installation of the second tool set. It is also possible that it was caused by the small step from 8.3.5 to 8.3.7 and the problem would have occurred with an upgrade using the same version (I did not test to find out). I do not know if this message will get to the PG group. None of my messages has shown up this week for some reason. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] FW: Multiple character encodings within a single database/table?
-Original Message- From: Dann Corbit Sent: Monday, March 23, 2009 3:50 PM To: pgsql-gene...@postgresql.org Subject: Multiple character encodings within a single database/table? If I have the C locale, can I have multiple character encodings within: 1. A single database? 2. A single table? More specifically, I would like to be able to have Unicode columns and ASCII text columns within the same table. Is this possible? If so, how do I achieve it? It was not clear to me from: http://www.postgresql.org/docs/current/static/multibyte.html It seems to me from this statement: "It can be overridden when you create a database, so you can have multiple databases each with a different character set." that the encoding may be database-wide, but I am not sure whether it is impossible to have both ordinary char and Unicode in the same table. Possible or not? I know that this message is better placed on the general group, but I got no reply in 24 hours. So perhaps I can get a rise out of hackers... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Windows installation service
The Windows installation service uses pg_ctl to perform the network start-up operation. This program starts up the postmaster and exits. The net effect of performing the operation in this manner is that the Windows service manager sees the service as not running a few minutes after the startup is complete. It also prevents proper pause and restart of the service. As a suggestion: Instead of installing pg_ctl as the service, start up postgres as the service. This is how we did our Windows port. If the idea is appealing to the PostgreSQL group, we can send our service code modifications for review as a possible alternative to the current method. Another approach that could be equally helpful (along the same lines) is to leave pg_ctl.exe in memory and allow it to control the program. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Windows installation service
-Original Message- From: Dave Page [mailto:dp...@pgadmin.org] Sent: Friday, April 10, 2009 8:16 AM To: Dann Corbit Cc: pgsql-hackers@postgresql.org; Bill Luton; Larry McGhaw; Mike McKee; Brian Fifer Subject: Re: [HACKERS] Windows installation service On Mon, Apr 6, 2009 at 9:32 PM, Dann Corbit dcor...@connx.com wrote: The Windows installation service uses pg_ctl to perform the network start-up operation. This program starts up the postmaster and exits. The net effect of performing the operation in this manner is that the Windows service manager sees the service as not running a few minutes after the startup is complete. It also prevents proper pause and restart of the service. Per our offlist conversation, this is not how it works. As a suggestion: Instead of installing pg_ctl as the service, start up postgres as the service. This is how we did our Windows port. If the idea is appealing to the PostgreSQL group, we can send our service code modifications for review as a possible alternative to the current method. Another approach that could be equally helpful (along the same lines) is to leave pg_ctl.exe in memory and allow it to control the program. Which is what does happen. I don't know the reason why, but that is not what happens here. We see the problem on 64-bit machines with Windows 2008 Server. We see the problem on 32-bit machines with Windows 2003 Server. We see the problem on 32-bit Windows XP machines. It is universal (all of these machines demonstrate the problem). I did get this email from Mike McKee this morning: "I noticed a pattern. The first time it works, and can shutdown. The second time is where it kind of hangs" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Windows installation service
-Original Message- From: Dann Corbit Sent: Friday, April 10, 2009 12:30 PM To: 'Dave Page' Cc: pgsql-hackers@postgresql.org; Bill Luton; Larry McGhaw; Mike McKee; Brian Fifer Subject: RE: [HACKERS] Windows installation service [snip -- quoted text identical to the previous message] I should mention that PostgreSQL is still operational. The Postgres servers are in memory and I am able to perform queries, despite the apparent status of the service. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Windows installation service
-Original Message- From: Dave Page [mailto:dp...@pgadmin.org] Sent: Friday, April 10, 2009 1:58 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org; Bill Luton; Larry McGhaw; Mike McKee; Brian Fifer Subject: Re: [HACKERS] Windows installation service On Fri, Apr 10, 2009 at 8:29 PM, Dann Corbit dcor...@connx.com wrote: I don't know the reason why, but that is not what happens here. We see the problem on 64-bit machines with Windows 2008 Server. We see the problem on 32-bit machines with Windows 2003 Server. We see the problem on 32-bit Windows XP machines. It is universal (all of these machines demonstrate the problem). I did get this email from Mike McKee this morning: "I noticed a pattern. The first time it works, and can shutdown. The second time is where it kind of hangs" So what is unusual about your machines that makes this happen for you, but apparently no one else we've heard of? Are you running on a domain? Yes. This is a corporation, so we don't run peer to peer. Any nonstandard security policy or other configuration? Nothing unusual. I know Connx have experience developing on Windows, so can one of you attach a debugger and see why the WaitForMultipleObjects() call in pg_ctl doesn't work on your systems? Here is what we tried: We decided to start the server from the command line as an executable rather than a service. First, we found permissions problems:

C:\CONNX32\CONNXSTORE\bin>C:\CONNX32\CONNXSTORE\bin\pg_ctl.exe start -w -N pgsql-8.3 -D C:\CONNX32\CONNXSTORE\data\
waiting for server to start...2009-04-10 14:55:22 PDT LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2009-04-10 14:55:22 PDT PANIC: could not open control file "global/pg_control": Permission denied
This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
could not start server

We allowed user postgres to have full control for the global folder.

C:\CONNX32\CONNXSTORE\bin>C:\CONNX32\CONNXSTORE\bin\pg_ctl.exe start -w -N pgsql-8.3 -D C:\CONNX32\CONNXSTORE\data\
waiting for server to start...2009-04-10 15:09:50 PDT LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2009-04-10 15:09:50 PDT LOG: could not create file "postmaster.opts": Permission denied
... done
server started

We had a similar problem with the data folder, so we changed permissions for user postgres to have full control for the data folder.

C:\CONNX32\CONNXSTORE\bin>C:\CONNX32\CONNXSTORE\bin\pg_ctl.exe start -w -N pgsql-8.3 -D C:\CONNX32\CONNXSTORE\data\
waiting for server to start...2009-04-10 15:19:49 PDT LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
could not start server

After those changes, we no longer had permissions problems, but the server did not start from the command line. I uninstalled PostgreSQL. I removed the postgres user. I removed the postgres directory structure. I installed PostgreSQL postgresql-8.3.7-1. Apparently this version does not have the same bad symptoms (I have not checked all the other machines yet -- that will take some time). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Is there an official log reader for PostgreSQL?
I am interested in doing a near real time transaction processor. If there is a place to find the specification for the log reader, I would be keen to know it. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is there an official log reader for PostgreSQL?
-Original Message- From: Dimitri Fontaine [mailto:dfonta...@hi-media.com] Sent: Wednesday, March 04, 2009 12:28 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Is there an official log reader for PostgreSQL? Hi, Le 4 mars 09 à 21:17, Dann Corbit a écrit : I am interested in doing a near real time transaction processor. If there is a place to find the specification for the log reader, I would be keen to know it. Have you read about PGQ yet? Some links include: http://www.pgcon.org/2008/schedule/events/79.en.html http://kaiv.wordpress.com/2007/10/19/skytools-database-scripting- framework-pgq/ http://skytools.projects.postgresql.org/doc/pgq-admin.html http://skytools.projects.postgresql.org/doc/pgq-sql.html Not yet. I will read it. Thanks. If your processing is to be coded in python, the facilities are provided in the current skytools releases. If your processing is to be coded in PHP, the facilities exists in the CVS and will get distributed soon'ish. If your processing is to be coded in any other language, you need SQL access and will be able to use the SQL level API directly, which provides all the raw features. The project will be coded in C++. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Kerberos V5 required for PostgreSQL installation on Windows
If Kerberos V5 is not installed on a Windows platform, the following error dialog is returned upon attempted installation:

Postgres.exe - Unable to Locate Component
This application has failed to start because krb5_32.dll was not found. Re-installing the application may fix this problem. [OK]

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is a plan for lmza commpression in pg_dump
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Bruce Momjian Sent: Wednesday, February 04, 2009 3:28 PM To: Stanislav Lacko Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Is a plan for lmza commpression in pg_dump Stanislav Lacko wrote: Hi. Is it in todo or in a plan to implement lmza commpression in pg_dump backups? Nope, never heard anything about it. In case the PG group does get interested in insertion of compression algorithms into PostgreSQL {it seems it could be useful in many different areas}, the 7zip format seems to be excellent in a number of ways. Here is an interesting benchmark that shows 7z format winning a large area of the optimal compressors performance graph: http://users.elis.ugent.be/~wheirman/compression/ The LZMA SDK is granted to the public domain: http://www.7-zip.org/sdk.html Unfortunately LZOP (which wins the top half of the optimal compressors graph where the compression and decompression speed is more important than amount of compression) does not have a liberal license. http://www.lzop.org/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.4 release planning
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Joshua D. Drake Sent: Monday, January 26, 2009 7:42 PM To: KaiGai Kohei Cc: Tom Lane; Ron Mayer; Josh Berkus; Robert Haas; Merlin Moncure; Jonah H. Harris; Gregory Stark; Simon Riggs; Bruce Momjian; Bernd Helmle; Peter Eisentraut; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] 8.4 release planning On Tue, 2009-01-27 at 12:30 +0900, KaiGai Kohei wrote: BTW, I have not walked on water yet. I have but I always end up wet. :) I find that it helps to freeze the water first. ;-) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simple postgresql.conf wizard
-Original Message- From: Greg Stark [mailto:[EMAIL PROTECTED] On Behalf Of Gregory Stark Sent: Tuesday, November 25, 2008 5:06 PM To: Decibel! Cc: Tom Lane; Dann Corbit; Robert Haas; Bruce Momjian; Mark Wong; Heikki Linnakangas; Josh Berkus; Greg Smith; pgsql- [EMAIL PROTECTED] Subject: Re: [HACKERS] Simple postgresql.conf wizard [snip] As Dann said, the idea that there IS a magic number is the problem. *Any* value of default_stats_target will cause problems. Some columns will always have skewed data sets which require unusually large samples, but most won't and the system will run faster with a normal sample size for that majority. No, it was somebody smarter than me who said that. My idea was to create some kind of table which shows curves for different values and then users will have some sort of basis for choosing. Of course, the guy who has 40 tables in his join with an average of 7 indexes on each table (each table containing millions of rows) and a convoluted WHERE clause will have different needs than someone who has simple queries and small data loads. The quality of the current statistical measures stored will also affect the intelligence of the query preparation process, I am sure. I do have a guess that larger and more expensive queries can probably benefit more from larger samples (this principle is used in sorting, for instance, where the sample I collect to estimate the median might grow as {for instance} the log of the data set size). P.S. I also do not believe that there is any value that will be the right answer. But a table of data might be useful both for people who want to toy with altering the values and also for those who want to set the defaults. I guess that at one time such a table was generated to produce the initial estimates for default values. [snip] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
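The sampling remark above (a sample that grows as the log of the data set size) can be made concrete with a toy sizing rule; base and scale are illustrative knobs of my own, not PostgreSQL settings:

```c
/* Toy stats-sample sizing rule: grow the sample with the (integer) log
 * of the table size instead of keeping it fixed. Purely illustrative. */
static long
ilog2(long n)
{
	long	b = 0;

	while (n > 1)
	{
		n >>= 1;
		b++;
	}
	return b;
}

static long
sample_rows(long table_rows, long base, long scale)
{
	return base + scale * ilog2(table_rows);
}
```

With base = 100 and scale = 30, a million-row table gets a modestly larger sample than a thousand-row table, rather than a thousand times larger.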
Re: [HACKERS] Simple postgresql.conf wizard -- Statistics idea...
-Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Tuesday, November 25, 2008 5:33 PM To: Dann Corbit Cc: Gregory Stark; Decibel!; Robert Haas; Bruce Momjian; Mark Wong; Heikki Linnakangas; Josh Berkus; Greg Smith; pgsql- [EMAIL PROTECTED] Subject: Re: [HACKERS] Simple postgresql.conf wizard Dann Corbit [EMAIL PROTECTED] writes: I also do not believe that there is any value that will be the right answer. But a table of data might be useful both for people who want to toy with altering the values and also for those who want to set the defaults. I guess that at one time such a table was generated to produce the initial estimates for default values. Sir, you credit us too much :-(. The actual story is that the current default of 10 was put in when we first implemented stats histograms, replacing code that kept track of only a *single* most common value (and not very well, at that). So it was already a factor of 10 more stats than we had experience with keeping, and accordingly conservatism suggested not boosting the default much past that. So we really don't have any methodically-gathered evidence about the effects of different stats settings. It wouldn't take a lot to convince us to switch to a different default, I think, but it would be nice to have more than none. I do have a statistics idea/suggestion (possibly useful with some future PostgreSQL 9.x or something): It is a simple matter to calculate lots of interesting univariate summary statistics with a single pass over the data (perhaps during a vacuum full). For instance with numerical columns, you can calculate mean, min, max, standard deviation, skew, kurtosis and things like that with a single pass over the data. Here is a C++ template I wrote to do that: http://cap.connx.com/tournament_software/STATS.HPP It also uses this template: http://cap.connx.com/tournament_software/Kahan.Hpp which is a high-accuracy adder. These things could easily be rewritten in C instead of C++.
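As a minimal C sketch of the same single-pass idea (my own illustration, not a port of the linked STATS.HPP/Kahan.Hpp templates): a Kahan compensated adder plus a Welford-style accumulator that yields count, min, max, mean, and variance in one scan. Skew and kurtosis would add two more running moments in the same style.

```c
#include <math.h>

/* Kahan compensated summation: carries the rounding error of each add
 * in c so that long sums of mixed magnitudes stay accurate. */
typedef struct { double sum, c; } kahan_t;

static void
kahan_add(kahan_t *k, double x)
{
	double	y = x - k->c;
	double	t = k->sum + y;

	k->c = (t - k->sum) - y;	/* rounding error recovered here */
	k->sum = t;
}

/* Single-pass accumulator: count, min, max, mean, and the second
 * central moment m2 (Welford's update), all in one scan of the data. */
typedef struct { long n; double min, max, mean, m2; } stats_t;

static void
stats_init(stats_t *s)
{
	s->n = 0;
	s->min = INFINITY;
	s->max = -INFINITY;
	s->mean = 0.0;
	s->m2 = 0.0;
}

static void
stats_add(stats_t *s, double x)
{
	double	d;

	s->n++;
	if (x < s->min) s->min = x;
	if (x > s->max) s->max = x;
	d = x - s->mean;
	s->mean += d / s->n;
	s->m2 += d * (x - s->mean);
}

static double
stats_variance(const stats_t *s)	/* sample variance */
{
	return s->n > 1 ? s->m2 / (s->n - 1) : 0.0;
}
```

For example, summing 1e16, 1.0, 1.0, -1e16 naively loses both 1.0s to rounding and yields 0, while the Kahan adder recovers the exact total 2.0.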
Now, if you store a few numbers calculated in this way, it can be used to augment your histogram data when you want to estimate the volume of a request. So (for instance) if someone asks for scalars greater than some value, you can look to see what percentage of the tail will hang out in that neck of the woods using standard deviation and the mean. I have another, similar idea (possibly useful someday far off in the future) that I think may have some merit. The idea is to create a statistical index. This index is updated whenever data values are modified in any way. For scalar/ordinal values such as float, integer, numeric it would simply store and update a statistics accumulator (a small vector of a few items holding statistical moments, counts and sums) for the column of interest. These indexes would be very small and inexpensive to {for instance} memory map. For categorical values (things like 'color' or 'department') we might store the count for the number of items that correspond to a hash in our statistical index. It would give you a crude count distinct for any item -- the only caveat being that more than one item could possibly have the same hash code (we would also keep a count of null items). If the count needed to be exact, we could generate a perfect hash for the data or store each distinct column value in the categorical index with its hash. The size of such an index would depend on the data, so that a bit column or char='m'/'f' for male/female or 'y'/'n' for yes/no indexes would contain just two counts, and a column that is unique would have one hash paired with the number one for each row of the table (a regular unique index would clearly be better in such a case, but such distributions could easily arise). The value of an index like this is that it is good where the normal index types are bad (e.g.
it is useless to create a btree index on a bit column or male/female, yes/no -- things of that nature, but a statistics counter would work nicely -- to give you whole table measures only, of course). The thing that is odd about these statistics indexes is that they do not even bother to point back to the data that they represent -- they are just abstract measures of the general contents. It seems to me that this kind of information could be used to improve the query plans for both categorical values and scalars and also be used to generate instant answers for some kinds of statistical queries. If used only for query planning, the values would not need to be exact, and could be updated only at vacuum full time or some other convenient time. The notion behind creation of a stats index would be that we may not need to maintain this detailed information for every column, but we can maintain such data for columns that we often filter with in our queries to get an idea of cardinality for a subset. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simple postgresql.conf wizard
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Robert Haas Sent: Wednesday, November 19, 2008 5:05 PM To: Bruce Momjian Cc: Mark Wong; Tom Lane; Heikki Linnakangas; Gregory Stark; Josh Berkus; Greg Smith; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Simple postgresql.conf wizard Yes, please test something. I am tired of us saying we need to increase default_statistics_target, but because we don't know the magic number, we do nothing release after release. I think the idea that there IS a magic number is the problem. No amount of testing is ever going to refute the argument that, under some other workload, a different value might be better. But that doesn't amount to a reason to leave it the way it is. Perhaps a table of experimental data could serve as a rough guideline.
[HACKERS] Installation oddity -- installer insists that PostgreSQL has failed to start, even though it is started and functioning correctly
Sorry about the HTML post, but I wanted to include the error form. When trying to perform an upgrade with PostgreSQL 8.3.5 over 8.3.4, the installer insists that PostgreSQL has failed to start. However, I can see the processes in memory (it has indeed started) and even attach with PG Admin III and perform queries. PG Admin III confirms the version string as: PostgreSQL 8.3.5, compiled by Visual C++ build 1400. My machine has the following parameters (CrystalCPUID 4.15.2.451 report, abridged): OS: Windows Server 2008 Server Standard Edition (full installation) SP1 [6.0 Build 6001]; CPU: Intel(R) Core(TM)2 Extreme CPU X9650 @ 3.00GHz (Yorkfield, 4 physical cores, 45 nm, LGA775); Clock: 2992.45 MHz (nominal 3000.00 MHz), system bus 1329.98 MHz; L1 cache: 32 KB I + 32 KB D; L2 cache: 6144 KB; Memory: 4029 MB; Features: MMX SSE SSE2 SSE3 SSSE3 SSE4.1 XD VT Intel 64, SpeedStep; South Bridge: Intel 82801IR (ICH9R); BIOS: Dell Inc. Phoenix ROM BIOS PLUS Version 1.10 A05 (05/09/2008); Motherboard: Dell 0TP412; System: Dell Precision WorkStation T3400. [Raw CPUID, MSR, and PCI device hex dumps omitted.]
[HACKERS] Strange query behavior where clause produces odd behavior on '>' query
The following query: SELECT * FROM Customers_connxstore where customerid > 'specd' Returns the row containing Customers_connxstore.customerid == 'SPECD' I would expect to get that row if the query was: SELECT * FROM Customers_connxstore where customerid <= 'specd' The other operators (<, <=, >, >=, !=) all work as expected. Sample file to reproduce the problem: http://cap.connx.com/bugreport/pgbug.sql.bz2 Is this a known issue with PostgreSQL or for some reason the desired behavior?
Re: [HACKERS] Strange query behavior where clause produces odd behavior on '>' query
-Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Thursday, October 30, 2008 5:31 PM To: Dann Corbit Cc: [HACKERS]; Sherry Griffin Subject: Re: [HACKERS] Strange query behavior where clause produces odd behavior on '>' query Dann Corbit [EMAIL PROTECTED] writes: The following query: SELECT * FROM Customers_connxstore where customerid > 'specd' Returns the row containing Customers_connxstore.customerid == 'SPECD' What encoding/locale are you using? Whatever the default encoding/locale is. I did not define any custom encoding, locale, or collating sequence. And while I'm asking, which PG version? All versions from PostgreSQL 7.1.3 to PostgreSQL 8.3.3, compiled by Visual C++ build 1400 seem to display this behavior. Are you unable to reproduce it?
Re: [HACKERS] Strange query behavior where clause produces odd behavior on '>' query
-Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Thursday, October 30, 2008 6:34 PM To: Dann Corbit Cc: [HACKERS]; Sherry Griffin Subject: Re: [HACKERS] Strange query behavior where clause produces odd behavior on '>' query Dann Corbit [EMAIL PROTECTED] writes: What encoding/locale are you using? Whatever the default encoding/locale is. Whatever is the wrong answer here. I just finished verifying that the sort order you're complaining about is the expected ordering in some locales. I suggest that you take the trouble to find out. English (United States) is my locale. Are you unable to reproduce it? Well, I see this on a Fedora machine: $ cat foo specd SPECD $ sort foo SPECD specd $ LANG=en_US sort foo specd SPECD $ The compare works as it should. The only bug was in my understanding.
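The locale effect Tom demonstrates with sort(1) can be seen directly from C as well: strcoll() in the "C" locale compares raw bytes, so 'SPECD' sorts before 'specd', while en_US dictionary rules reverse that. A minimal sketch (only the always-available "C" locale is exercised; locale names like "en_US.UTF-8" vary by platform):

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Compare two strings under a named collation.  In the "C" locale this
 * is plain byte order ('S' = 0x53 < 's' = 0x73), which is the ordering
 * the original report expected; under en_US, dictionary rules apply and
 * the relative order of 'SPECD' and 'specd' flips, which is exactly why
 * the > comparison returned that row. */
static int collate_cmp(const char *locale_name, const char *a, const char *b)
{
    setlocale(LC_COLLATE, locale_name);
    return strcoll(a, b);
}
```

PostgreSQL's text comparison operators go through the equivalent of strcoll() under the database's lc_collate setting, so the query result follows the same rules.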
Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!
Hi Simon, He is going to do some investigation in the methods and write down the possibilities and then he is going to implement something from that for PostgreSQL. When will this work be complete? We are days away from completing main work on 8.4, so you won't get much discussion on this for a few months yet. Will it be complete in time for 8.5? Or much earlier even? The first guess is that the work will be done for 8.6. Dano is supposed to finish the work and defend his thesis in something a bit more than a year. Julius, you don't mention what your role is in this. In what sense is Dano's master's thesis a we thing? I am Dano's mentor and we have close contact with Zdenek as well. We would like the project to become a we thing as another reason why to work on the project. It seems to be better to research some ideas at the beginning and discuss the stuff during development than just individually writing some piece of code which could be published afterwards. Especially when this area seems to be of interest to more people. Threads are where future performance is going to come from: General purpose- http://www.setup32.com/hardware/cpuchipset/32core-processors-intel-reach e.php GPU- http://wwwx.cs.unc.edu/~lastra/Research/GPU_performance.html http://www.cs.unc.edu/~geom/GPUSORT/results.html Database engines that want to exploit the ultimate in performance will utilize multiple threads of execution. True, the same thing can be realized by multiple processes, but a process is more expensive than a thread.
Re: [HACKERS] Plugin system like Firefox
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Bruce Momjian Sent: Tuesday, August 12, 2008 9:04 PM To: Robert Haas Cc: Markus Wanner; Matthew T. O'Connor; Tom Lane; Greg Sabino Mullane; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Plugin system like Firefox Robert Haas wrote: .. an OS-agnostic way of installing packages. Uh.. I don't think such a thing exists. Seems to in Firefox. I think nearly all Firefox extensions are written in XML and Javascript. It's quite a bit easier to have an OS-agnostic way of installing packages when the packages are platform-independent. I realize many prefer their OS-native packaging system, but that isn't the target audience of a packaging system that will increase adoption. Really? Who is it? People who want to try a database and its plugins to see if they like it. For what it's worth, I am rather fond of the firefox plugin for sqlite. Now, SQLite has very little in the way of administrative tools, so that tilts the field, but I guess other people might toy with PostgreSQL just because a plugin is available, and might not otherwise.
Re: [HACKERS] Whence cometh the data in src/test/regress/data/streets.data ?
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Friday, July 25, 2008 12:52 PM To: Bjorn Munch Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Whence cometh the data in src/test/regress/data/streets.data ? Bjorn Munch [EMAIL PROTECTED] writes: This file was checked in way back in July 1996, by Marc G. Fournier but that doesn't mean he was the one who got the data from somewhere. Does anyone know where it comes from? Or has this information been lost in the mist of time? It's in the postgres v4r2 tarball with a file date of 1993-01-08, which means nobody around the current project has a clue. IANAL either, but I think it'd be fairly hard for anyone to assert a copyright on it given that it's a compilation of publicly available facts, and is surely not in the same format the information was originally published in anyhow. Probably something from here: http://www.freegis.org/database/?cat=1
Re: [HACKERS] [RFC] Unsigned integer support.
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Friday, July 25, 2008 12:32 PM To: Ryan Bradetich Cc: Gregory Stark; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [RFC] Unsigned integer support. Ryan Bradetich [EMAIL PROTECTED] writes: On Fri, Jul 25, 2008 at 3:57 AM, Gregory Stark [EMAIL PROTECTED] wrote: Ryan Bradetich [EMAIL PROTECTED] writes: My plans for the example above would be: 1. SELECT 15 + 15 -- Throws overflow error. 2. SELECT 15::uint4 + 15 -- Returns 30::uint4. You could make it work by having a uint4+int4 operator which returns uint4 but then you're going to need a *lot* of operators This was my plan. Like he says, it's a *lot* of operators, and the point doesn't seem entirely clear to me. You'll still have overflow cases, they'll just be in different places. Consider the idea of not having any uint4-specific arithmetic operators, but instead providing the following: * assignment casts from int4 and int8 to uint4 (these throw error if out of range, of course) * implicit cast from uint4 to int8 (can never fail) The effect of providing the latter cast would be that any arithmetic involving a uint4 column would automatically be done in int8. Which would make it a shade slower than a native implementation, but probably not enough slower to be a problem --- and you'd avoid having to write dozens of operators and underlying support functions. Storing into the uint4 column would work fine with no extra notation because of the assignment casts. 
Moreover, you'd avoid cluttering the system with a pile of cross-type operators, which we have recently realized are not a good thing, because they increase the likelihood of ambiguous operator problems --- see http://archives.postgresql.org/pgsql-hackers/2008-06/msg00750.php For uint8 you'd have to promote to numeric to guarantee no failure in the implicit cast; which is going to be a rather bigger performance hit, but I don't really see uint8 as being a type with huge demand. Now you probably *will* want cross-type comparison operators, if you are going to support indexing of unsigned columns, so that something like uint4col > 42 can be indexed without any casting. But limiting yourself to the six basic comparison operators certainly makes it a much less bulky project. At the cost of one bit of storage, you have compatible types using CREATE DOMAIN: CREATE DOMAIN name [ AS ] data_type [ DEFAULT expression ] [ constraint [ ... ] ] where constraint is: [ CONSTRAINT constraint_name ] { NOT NULL | NULL | CHECK (expression) } More specifically: CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0); CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0); CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0); CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0); Seems like a heck of a lot less work to me. Not to mention very easy to use. C:\Program Files (x86)\PostgreSQL\8.3\bin>psql -h localhost -U postgres domaintest Password for user postgres: Welcome to psql 8.3.3, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit Warning: Console code page (437) differs from Windows code page (1252) 8-bit characters might not work correctly. See psql reference page Notes for Windows users for details.
domaintest=# CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0); CREATE DOMAIN domaintest=# CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0); CREATE DOMAIN domaintest=# CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0); CREATE DOMAIN domaintest=# CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0); CREATE DOMAIN domaintest=# domaintest=# create table integer_types ( domaintest(# usCol usmallint, domaintest(# sCol smallint, domaintest(# uiCol uinteger, domaintest(# iCol integer, domaintest(# ubCol ubigint, domaintest(# bCol bigint, domaintest(# unCol unumeric, domaintest(# nCol numeric domaintest(# ); CREATE TABLE domaintest=# create index i1 on integer_types(usCol); CREATE INDEX domaintest=# create index i2 on integer_types(sCol); CREATE INDEX domaintest=# create index i3 on integer_types(uiCol); CREATE INDEX domaintest=# create index i4 on integer_types(iCol); CREATE INDEX domaintest=# create index i5 on integer_types(ubCol); CREATE INDEX domaintest=# create index i6 on integer_types(bCol); CREATE INDEX domaintest=# create index i7 on integer_types(unCol); CREATE INDEX domaintest=# create index i8 on integer_types(nCol); CREATE INDEX domaintest=# insert into integer_types values(1,1,1,1,1,1,1,1); INSERT 0 1 domaintest=# select * from integer_types; uscol | scol | uicol | icol | ubcol | bcol | uncol | ncol
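Tom's cast-based scheme can be sketched in C (helper names and error handling are mine; the real thing would be SQL-callable functions raising ereport(ERROR, ...) through the fmgr). Assignment to a uint4 column range-checks; reading promotes losslessly to int8 so all arithmetic happens in the wider signed type:

```c
#include <assert.h>
#include <stdint.h>

/* Assignment cast int8 -> uint4: errors (status return here) if the
 * value is out of [0, 2^32 - 1]. */
static int int8_to_uint4(int64_t v, uint32_t *out)
{
    if (v < 0 || v > (int64_t) UINT32_MAX)
        return -1;                  /* out of range: would raise an error */
    *out = (uint32_t) v;
    return 0;
}

/* Implicit cast uint4 -> int8: every uint32 fits in int64, so this
 * can never fail, and all arithmetic is then done in int8. */
static int64_t uint4_to_int8(uint32_t v)
{
    return (int64_t) v;
}
```

The pair captures why no uint4-specific arithmetic operators are needed: overflow checking happens once, at assignment, rather than in dozens of operator functions.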
Re: [HACKERS] [RFC] Unsigned integer support.
-Original Message- From: Andrew Dunstan [mailto:[EMAIL PROTECTED] Sent: Friday, July 25, 2008 1:11 PM To: Dann Corbit Cc: Tom Lane; Ryan Bradetich; Gregory Stark; pgsql- [EMAIL PROTECTED] Subject: Re: [HACKERS] [RFC] Unsigned integer support. Dann Corbit wrote: CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0); CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0); CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0); CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0); s/>/>=/g I turned off the default option to remove extra line breaks. Future posts should not be quite as even and bletcherous. God willing, and the crick don't rise.
Re: [HACKERS] [RFC] Unsigned integer support.
-Original Message- From: Andrew Dunstan [mailto:[EMAIL PROTECTED] Sent: Friday, July 25, 2008 1:28 PM To: Dann Corbit Cc: Tom Lane; Ryan Bradetich; Gregory Stark; pgsql- [EMAIL PROTECTED] Subject: Re: [HACKERS] [RFC] Unsigned integer support. Dann Corbit wrote: -Original Message- From: Andrew Dunstan [mailto:[EMAIL PROTECTED] Sent: Friday, July 25, 2008 1:11 PM To: Dann Corbit Cc: Tom Lane; Ryan Bradetich; Gregory Stark; pgsql- [EMAIL PROTECTED] Subject: Re: [HACKERS] [RFC] Unsigned integer support. Dann Corbit wrote: CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0); CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0); CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0); CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0); s/>/>=/g I turned off the default option to remove extra line breaks. Future posts should not be quite as even and bletcherous. God willing, and the crick don't rise. I suspect you're missing my point, namely that 0 should be an allowed value for unsigned types. Quite right. The domains I created were really the 'natural numbers' rather than unsigned types.
Re: [HACKERS] Research/Implementation of Nested Loop Join optimization
When you install the source tree (e.g. in folder \postgresql-8.3.x) you will want to examine nodeMergejoin.c typically found in a path similar to this: \postgresql-8.3.x\src\backend\executor\nodeMergejoin.c Here are the comments from the version on my machine: /* * INTERFACE ROUTINES *ExecMergeJoin mergejoin outer and inner relations. *ExecInitMergeJoin creates and initializes run time states *ExecEndMergeJoin cleans up the node. * * NOTES * *Merge-join is done by joining the inner and outer tuples satisfying *join clauses of the form ((= outerKey innerKey) ...). *The join clause list is provided by the query planner and may contain *more than one (= outerKey innerKey) clause (for composite sort key). * *However, the query executor needs to know whether an outer *tuple is greater/smaller than an inner tuple so that it can *synchronize the two relations. For example, consider the following *relations: * *outer: (0 ^1 1 2 5 5 5 6 6 7)current tuple: 1 *inner: (1 ^3 5 5 5 5 6) current tuple: 3 * *To continue the merge-join, the executor needs to scan both inner *and outer relations till the matching tuples 5. It needs to know *that currently inner tuple 3 is greater than outer tuple 1 and *therefore it should scan the outer relation first to find a *matching tuple and so on. * *Therefore, rather than directly executing the merge join clauses, *we evaluate the left and right key expressions separately and then *compare the columns one at a time (see MJCompare). The planner *passes us enough information about the sort ordering of the inputs *to allow us to determine how to make the comparison. We may use the *appropriate btree comparison function, since Postgres' only notion *of ordering is specified by btree opfamilies. * * *Consider the above relations and suppose that the executor has *just joined the first outer 5 with the last inner 5. The *next step is of course to join the second outer 5 with all *the inner 5's. 
This requires repositioning the inner cursor *to point at the first inner 5. This is done by marking the *first inner 5 so we can restore the cursor to it before joining *with the second outer 5. The access method interface provides *routines to mark and restore to a tuple. * * *Essential operation of the merge join algorithm is as follows: * *Join { *get initial outer and inner tuples INITIALIZE *do forever { *while (outer != inner) { SKIP_TEST * if (outer < inner) * advance outer SKIPOUTER_ADVANCE * else * advance inner SKIPINNER_ADVANCE *} *mark inner position SKIP_TEST *do forever { * while (outer == inner) { * join tuples JOINTUPLES * advance inner position NEXTINNER * } * advance outer position NEXTOUTER * if (outer == mark) TESTOUTER * restore inner position to mark TESTOUTER * else * break// return to top of outer loop *} *} *} * *The merge join operation is coded in the fashion *of a state machine. At each state, we do something and then *proceed to another state. This state is stored in the node's *execution state information and is preserved across calls to *ExecMergeJoin. -cim 10/31/89 */ From: [EMAIL PROTECTED] [mailto:[EMAIL
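The state machine in those comments can be rendered as a toy standalone function, under heavy simplification (integer keys, equality join, no NULLs, arrays instead of tuple streams; this is an illustration, not executor code). Note how the mark/restore of the inner position lets every duplicate outer key join with the whole matching inner run:

```c
#include <assert.h>
#include <stddef.h>

/* Merge-join two ascending-sorted integer arrays; returns the number
 * of joined (outer, inner) pairs an equality join would produce. */
static size_t merge_join_count(const int *outer, size_t n_out,
                               const int *inner, size_t n_in)
{
    size_t i = 0, j = 0, pairs = 0;

    while (i < n_out && j < n_in)
    {
        if (outer[i] < inner[j])
            i++;                        /* SKIPOUTER_ADVANCE */
        else if (outer[i] > inner[j])
            j++;                        /* SKIPINNER_ADVANCE */
        else
        {
            size_t mark = j;            /* mark inner position */
            int key = outer[i];

            while (i < n_out && outer[i] == key)
            {
                /* restore inner to mark, then join with the whole run */
                for (j = mark; j < n_in && inner[j] == key; j++)
                    pairs++;            /* JOINTUPLES */
                i++;                    /* NEXTOUTER */
            }
        }
    }
    return pairs;
}
```

Running it on the example relations from the comment, outer (0 1 1 2 5 5 5 6 6 7) and inner (1 3 5 5 5 5 6), yields 2 + 12 + 2 = 16 joined pairs for keys 1, 5 and 6.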
Re: [HACKERS] Research/Implementation of Nested Loop Join optimization
From: Manoel Henrique [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 23, 2008 1:47 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Research/Implementation of Nested Loop Join optimization The nodeMergejoin.c is the code for the Merge Join, isn't it? I am trying to find a way to change the Nested Loop Join. It would be more like on nodeNestloop.c: when rescanning the inner plan (second time scanning the inner plan and so on) he'd change the scan direction. If the scan direction was from first tuple to last tuple it would go backwards; if it was from last to first it would go forward... The code I'm looking at atm is from 8.3.1; it seems to have some kind of direction manager but doesn't seem to be in use. You are right. You want file: nodeNestloop.c
Re: [HACKERS] [PATCH]-hash index improving
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Xiao Meng Sent: Tuesday, July 22, 2008 7:57 PM To: Simon Riggs Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [PATCH]-hash index improving Well, I'll do it after I finish my second patch. Hash index should be more efficient than btree when N is big enough. It seems meaningful to find how big N is in an experiment way. The savings will depend on many factors. Another thing (besides volume) which is important is the sort of key data being indexed. Consider a unique key on six varchar(40) fields: 1. Country 2. State/Province 3. City 4. Division 5. Branch 6. Office Typically, a key such as this will have lots of clumps of similar data, only being differentiated with the final segment. This sort of index is often used for reporting purposes. To determine a unique entry, it is not unlikely that more than 200 characters will be traversed. A hash index gets a special boost here because a much smaller data signature is stored. Even a 64 bit hash occupies only 8 bytes. On the other hand, consider an index on a field consisting of a single character. Here, the pages of the b-tree will have a huge volume of entries per page, requiring fewer pages to search, and the hash index is many times larger and hence more pages will have to be loaded. These things make a hash index desirable: 1. Unique index 2. Long keys 3. Huge data cardinality 4. Equality search These things make a hash index undesirable: 1. Non-unique index 2. Short keys 3. Small data sets These things render a hash index as worthless (except in COMBINATION with a b-tree type index): 1. Need for range searches like BETWEEN 2. Need for ORDER BY on the column(s) As an aside: I guess it will also be nice if you can CLUSTER both index and data values on the hash. It may need a different algorithm than a b-tree clustering concept. 
I know that Rdb uses different kinds of storage areas for hashed indexes versus b-tree indexes. This effort to create hashed indexes is very valuable. Its advantage grows as the data scales up, so right at the point when things get tougher is when it becomes the most helpful. If you have a tiny table, it does not even matter if you index it, because (for instance) 10 rows will probably always stay in memory and iteration will find what is needed instantly. But if you have hundreds of millions of rows or billions of rows, now is when performance really matters. So when the data scales to preposterous size (which it has an uncanny ability to do) the boost of performance becomes even more valuable.
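The "smaller data signature" point about long composite keys can be illustrated concretely: the 200+ characters of a country/state/city/division/branch/office key collapse to 8 bytes in the index. FNV-1a is used below purely as a stand-in hash (PostgreSQL's hash AM uses its own hash function):

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit FNV-1a over a NUL-terminated string. */
static uint64_t fnv1a64(const char *s)
{
    uint64_t h = 0xcbf29ce484222325ULL;     /* FNV offset basis */
    for (; *s; s++)
    {
        h ^= (uint8_t) *s;
        h *= 0x100000001b3ULL;              /* FNV prime */
    }
    return h;
}

/* Fold all key segments into one 64-bit signature -- equivalent to
 * hashing their concatenation, so a six-segment varchar(40) key
 * occupies 8 bytes per entry regardless of its text length. */
static uint64_t composite_key_hash(const char *segs[], int nsegs)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    for (int i = 0; i < nsegs; i++)
        for (const char *p = segs[i]; *p; p++)
        {
            h ^= (uint8_t) *p;
            h *= 0x100000001b3ULL;
        }
    return h;
}
```

This is also where the single-character-column counterexample bites: the 8-byte signature is much larger than the 1-byte key it replaces, so the hash index inflates rather than shrinks.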
Re: [HACKERS] [PATCH]-hash index improving
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Simon Riggs Sent: Thursday, July 17, 2008 4:10 PM To: Jonah H. Harris Cc: Xiao Meng; pgsql-hackers@postgresql.org; Kenneth Marshall Subject: Re: [HACKERS] [PATCH]-hash index improving On Thu, 2008-07-17 at 16:24 -0400, Jonah H. Harris wrote: I'm not really seeing any performance improvements over b-tree. I'd like to see a theoretical analysis on the benefit case before we spend too many teraflops on investigation. In which cases do we expect that hash indexes will beat btrees? Large table unique index equality search should be very fast with hashed index (and the only place where any advantage will be seen). Hashed indexes are useless for any search besides equality and gain more and more when the levels of the b-tree index increase. Here is a hash index lookup that is frightfully fast: http://www.corpit.ru/mjt/tinycdb.html It's a constant database, but the file format might be worth examination. Here is a quickie definition of the CDB format: http://cr.yp.to/cdb/cdb.txt I think that the problem with hashed indexes tends to be the blocking of index pages on disk. For instance, the FastDB/GigaBase author was able to make FastDB memory based hash indexes that are faster than the memory based b-tree versions, but not for the disk based GigaBase database. The only way to get better performance from hash based indexes is to read fewer index pages than if a tree-based index were used. So I think that the scheme used to create the index pages is the focus to make them worthwhile.
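For reference, the hash at the heart of the cdb format linked above is tiny; per the spec at cr.yp.to, it starts at 5381 and folds each byte in with h = (h * 33) XOR byte. cdb then uses the low 8 bits to pick one of 256 hash tables and the remaining bits to probe within it, which is what bounds a lookup to two disk seeks:

```c
#include <assert.h>
#include <stdint.h>

/* The cdb hash function as described in http://cr.yp.to/cdb/cdb.txt:
 * h starts at 5381; for each byte, h = (h << 5) + h (i.e. h * 33),
 * XORed with the byte. */
static uint32_t cdb_hash(const unsigned char *buf, unsigned int len)
{
    uint32_t h = 5381;
    while (len--)
        h = ((h << 5) + h) ^ *buf++;
    return h;
}
```

The on-disk layout (a fixed 256-slot toc, then hash tables of (hash, offset) pairs) is exactly the sort of page-blocking scheme the paragraph above argues matters more than the hash itself.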
Re: [HACKERS] TIMESTAMP and daylight savings time question
We figured the problem out. Our older version did not have the OS patch: http://www.postgresqlforums.com/wiki/2007_Daylight_Savings_Time From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dann Corbit Sent: Wednesday, March 12, 2008 6:06 PM To: pgsql-hackers@postgresql.org Cc: Larry McGhaw Subject: [HACKERS] TIMESTAMP and daylight savings time question It appears that the older versions of PostgreSQL (7.x) do not consider the daylight savings time when using TIMESTAMP WITH TIMEZONE but the most recent versions do (8.x). Can someone tell me the exact PostgreSQL version number where the behavior changed?
[HACKERS] TIMESTAMP and daylight savings time question
It appears that the older versions of PostgreSQL (7.x) do not consider the daylight savings time when using TIMESTAMP WITH TIMEZONE but the most recent versions do (8.x). Can someone tell me the exact PostgreSQL version number where the behavior changed?
Re: [HACKERS] bug in numeric_power() function
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Richard Wang Sent: Tuesday, March 11, 2008 7:38 PM To: pgsql-hackers@postgresql.org Subject: [HACKERS] bug in numeric_power() function I expected 0 ^ 123.3 to be 0, but it reported error as follows postgres=# select 0 ^ 123.3; ERROR: cannot take logarithm of zero I find that there is a bug in numeric_power() function the function calculates a ^ b based on the algorithm e ^ (ln a * b) as you see, ln 0 is not valid It seems an obvious work-around that: if (b == 0) return 1; if (a == 0) return 0; could be inserted at the top. Aside: Having the ^ operator overloaded for exponentiation surprises me. Does it really have the right precedence for performing exponentiation? (E.g. you cannot do this in C++ because ^ will have the precedence of xor, which is wrong.)
Re: [HACKERS] Will PostgreSQL get ported to CUDA?
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Christopher Browne Sent: Wednesday, January 30, 2008 9:56 AM To: pgsql-hackers@postgresql.org Subject: [HACKERS] Will PostgreSQL get ported to CUDA? 2008/1/30 Dann Corbit [EMAIL PROTECTED]: http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html http://www.nvidia.com/object/cuda_learn.html http://www.nvidia.com/object/cuda_get.html Someone at CMU has tried this, somewhat fruitfully. http://www.andrew.cmu.edu/user/ngm/15-823/project/Draft.pdf http://www.andrew.cmu.edu/user/ngm/15-823/project/Final.pdf This was based on GPUSort: http://gamma.cs.unc.edu/GPUSORT/ Unfortunately, the licensing of GPUSort is, if anything, more awful than that for CUDA. http://gamma.cs.unc.edu/GPUSORT/terms.html This would need to get pretty totally reimplemented to be useful with PostgreSQL. Happily, we actually have some evidence that the exercise would be of some value. Further, it looks to me like the implementation that was done was done in a pretty naive way. Something done more seriously would likely be radically better... It's too bad that they have a restrictive license. Perhaps there is an opportunity to create an information appliance that contains a special build of PostgreSQL, a nice heap of super-speedy disk, and a big pile of GPUs for sort and merge type operations. The thing that seems nice to me about this idea is that you would have a very stable test platform (all hardware and software combinations would be known and thoroughly tested) and you might also get some extreme performance. I guess that a better sort than GPUSort could be written from scratch, but legal entanglements with the use of the graphics cards may make the whole concept DOA. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Will PostgreSQL get ported to CUDA?
http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html http://www.nvidia.com/object/cuda_learn.html http://www.nvidia.com/object/cuda_get.html
Re: [HACKERS] operator suggest interval / interval = numeric
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Wednesday, January 09, 2008 10:00 PM To: Warren Turkal Cc: Brendan Jurd; Ilya А. Кovalenko; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] operator suggest interval / interval = numeric Warren Turkal [EMAIL PROTECTED] writes: YEAR TO MONTH and DAY TO {HOUR,MINUTE,SECOND} intervals should not combine. PostgreSQL correctly doesn't allow {YEAR,MONTH} TO {DAY,HOUR,MINUTE,SECOND} intervals. Really? I think you've confused some unimplemented decorative syntax with what the underlying datatype will or won't do. This is inconsistent. I would like to ultimately not allow operations on interval combinations that are not allowed by the SQL standard. The spec's approach to datetime operations in general is almost totally brain-dead, and so you won't find a lot of support around here for hewing to the straight-and-narrow-spec-compliance approach. If they have not even heard of daylight-savings time, how can anyone credit them with any meaningful contact with the real world? We'll cite the spec where it suits us, but in this area "the spec says you can't do that" carries very little weight. Or were you planning to lobby for removal of our DST support, too? Don't forget indexes. The standard does not breathe a word about them.
Re: [HACKERS] Sorting Improvements for 8.4
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Brian Hurt Sent: Thursday, December 20, 2007 6:42 AM To: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Sorting Improvements for 8.4 While we're blue skying things, I've had an idea for a sorting algorithm kicking around for a couple of years that might be interesting. It's a variation on heapsort to make it significantly more block-friendly. I have no idea if the idea would work, or how well it'd work, but it might be worthwhile kicking around. Now, the core idea of heapsort is that the array is put into heap order- basically, that a[i] >= a[2i+1] and a[i] >= a[2i+2] (doing the 0-based array version here). The problem is that, assuming that the length of a is larger than memory, then a[2i+1] is likely going to be on a different page or block than a[i]. That means every time you have to bubble down a new element, you end up reading O(log N) blocks- this is *per element*. The variation is to instead work with blocks, so you have a block of entries b[i], and you change the definition of heap order, so that min(b[i]) >= max(b[2i+1]) and min(b[i]) >= max(b[2i+2]). Also, during bubble down, you need to be careful to only change the minimum value of one of the two child blocks b[2i+1] and b[2i+2]. Other than that, the algorithm works as normal. The advantage of doing it this way is that while each bubble down still takes O(log N) blocks being touched, you get an entire block worth of results for your effort. Make your blocks large enough (say, 1/4 the size of work_mem) and you greatly reduce N, the number of blocks you have to deal with, and get much better I/O (when you're reading, you're reading megabytes at a shot). Now, there are boatloads of complexities I'm glossing over here. This is more of a sketch of the idea. But it's something to consider. It's an interesting idea to work with a heap of heaps where you try to keep each heap page-sized. 
It reminds me of the B+ tree, where you collect a whole bunch of nodes into a single page. I don't know if you have examined weak-heaps, but there are some interesting results for weak-heap approaches. As you know, heapsort variants do not degenerate to O(N^2). On this link: http://www.jea.acm.org/2002/EdelkampHeapsort/ I highly recommend all the goodies he has embedded (papers, source, etc.)
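For reference, this is the plain 0-based sift-down that the block variant generalizes, as a self-contained C sketch over an int array; in the proposed scheme the single-element comparisons would become min/max comparisons between page-sized blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Max-heap order: a[i] >= a[2i+1] and a[i] >= a[2i+2].  Bubble the
 * element at index i down until the heap property holds again. */
static void sift_down(int *a, size_t i, size_t n)
{
    for (;;) {
        size_t child = 2 * i + 1;
        if (child >= n)
            break;                        /* no children: done */
        if (child + 1 < n && a[child + 1] > a[child])
            child++;                      /* pick the larger child */
        if (a[i] >= a[child])
            break;                        /* heap order restored */
        int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
        i = child;                        /* continue one level down */
    }
}

/* Classic in-place heapsort built on the sift-down above. */
static void heapsort_ints(int *a, size_t n)
{
    for (size_t i = n / 2; i-- > 0; )     /* heapify */
        sift_down(a, i, n);
    for (size_t k = n; k-- > 1; ) {       /* pop the max into the tail */
        int tmp = a[0]; a[0] = a[k]; a[k] = tmp;
        sift_down(a, 0, k);
    }
}
```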
Re: [HACKERS] Sorting Improvements for 8.4
As long as sorting improvements are being considered, may I suggest an experiment that uses a very simple model? Assuming that you have K subfiles created by the initial sorting pass, insert the top record of each file into a priority queue. Then, emit records from the queue until the priority queue is empty. Now, there will be the objection that we will be jumping willy-nilly all over the disk because of reading one record at a time, but (depending on how it is implemented) generally several records are buffered during a read. So (as a gentle suggestion) I suggest testing the model. It works great for a single CPU or multiple CPU system for the work that *I* do. I have no idea if it will be a benefit for PostgreSQL or not, but it should be a very simple matter to try it. As long as someone is doing the work right now, it would be a good time to give it a go. I am not very familiar with PostgreSQL internals, but I would be willing to give a hand with it (not really sure how much time I can guarantee, though, since I would be doing it on my free time).
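The suggested experiment can be sketched in plain C. K sorted runs are merged in one pass by keeping one cursor per run in a binary min-heap keyed on the cursor's current head record; runs are modeled as in-memory sorted int arrays, where a real external sort would use buffered file reads. All names here are illustrative, not PostgreSQL internals.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const int *run;     /* one sorted subfile */
    size_t pos, len;    /* read position and total length */
} Cursor;

static int cur_head(const Cursor *c) { return c->run[c->pos]; }

/* Min-heap sift-down over cursors, ordered by current head record. */
static void heap_sift(Cursor *h, size_t i, size_t n)
{
    for (;;) {
        size_t c = 2 * i + 1;
        if (c >= n) break;
        if (c + 1 < n && cur_head(&h[c + 1]) < cur_head(&h[c]))
            c++;                          /* smaller of the two heads */
        if (cur_head(&h[i]) <= cur_head(&h[c])) break;
        Cursor t = h[i]; h[i] = h[c]; h[c] = t;
        i = c;
    }
}

/* Merge k runs into out; returns the number of records emitted. */
static size_t kway_merge(Cursor *h, size_t k, int *out)
{
    for (size_t i = k / 2; i-- > 0; )     /* heapify the K heads */
        heap_sift(h, i, k);
    size_t n = 0;
    while (k > 0) {
        out[n++] = h[0].run[h[0].pos++];  /* emit the smallest head */
        if (h[0].pos == h[0].len)
            h[0] = h[--k];                /* run drained: shrink heap */
        heap_sift(h, 0, k);
    }
    return n;
}
```

Each emitted record costs O(log K) comparisons, and the whole merge is a single pass regardless of K, which is the point of the proposal.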
Re: [HACKERS] Sorting Improvements for 8.4
-Original Message- From: Jeff Davis [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 19, 2007 3:10 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Sorting Improvements for 8.4 On Wed, 2007-12-19 at 14:41 -0800, Dann Corbit wrote: As long as sorting improvements are being considered, may I suggest an experiment that uses a very simple model? Assuming that you have K subfiles created by the initial sorting pass, insert the top record of each file into a priority queue. Then, emit records from the queue until the priority queue is empty. What is the principal difference between that idea and our existing sort algorithm? There's a good explanation in the comment at the top of tuplesort.c. According to the comments, PostgreSQL uses replacement selection. Replacement selection is a wonderful thing because it creates runs that are twice as long as normal due to the snowplow effect. See (for instance): http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/69/27216/01209012.pdf Then, the merge routine has only half as many runs to merge together. So (for instance) if you would create 1024 subfiles without replacement selection, replacement selection will create 512. That saves one merge pass. The algorithm that I am suggesting will take exactly one pass to merge all of the files. It works like this... Imagine an array of pointers to the subfiles: [*subfile][*subfile]...[*subfile] Step 0: We sort the array by a comparison operator that examines the top element of each subfile. So now the array is ordered such that the record with the smallest key is in array slot 0. Step 1: We remove the first record from the subfile in array slot 0. Now, the priority of the first element *may* have changed. So if it is no longer smaller than the subfile immediately to the right, we do a binary insertion to put this subfile in its new location, moving the contents of array slot[1] to array slot 0 if it is needed. 
Step 2: Is the entire list of subfiles empty? If yes, then terminate; if no, then go to Step 1. Like I said, it is ultra-simple and it sorts the entire contents of all subfiles to the output with a single pass. Consider the way that current replacement selection works. The actual O(f(N)) behavior of replacement selection is a terrible O(N^2). But because we save one full merge pass, it is usually worth it anyway, since memory access is much faster than disk. And if we only have a few subfiles, the savings will be large. In the case of a priority queue merge, we only have one single merge pass no matter how many subfiles there are.
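A compact C sketch of Steps 0-2 as described, with subfiles modeled as sorted in-memory int arrays behind a Tape cursor. Every name is illustrative, not tuplesort code; slot 0 always holds the subfile whose head record is smallest.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const int *run;     /* one sorted subfile */
    size_t pos, len;    /* read position and total length */
} Tape;

static int tape_head(const Tape *t) { return t->run[t->pos]; }

/* Step 1: re-place the cursor in slot 0 by binary insertion. */
static void reinsert(Tape *t, size_t k)
{
    Tape moved = t[0];
    size_t lo = 1, hi = k;        /* find first slot with head >= moved */
    while (lo < hi) {
        size_t mid = (lo + hi) / 2;
        if (tape_head(&t[mid]) < tape_head(&moved)) lo = mid + 1;
        else hi = mid;
    }
    for (size_t i = 1; i < lo; i++)
        t[i - 1] = t[i];          /* shift the smaller cursors left */
    t[lo - 1] = moved;
}

static size_t merge_tapes(Tape *t, size_t k, int *out)
{
    /* Step 0: order the cursors by head record (k is small). */
    for (size_t i = 1; i < k; i++)
        for (size_t j = i; j > 0 && tape_head(&t[j]) < tape_head(&t[j - 1]); j--) {
            Tape tmp = t[j]; t[j] = t[j - 1]; t[j - 1] = tmp;
        }
    size_t n = 0;
    while (k > 0) {                       /* Step 2: until all drained */
        out[n++] = t[0].run[t[0].pos++];  /* emit the smallest head */
        if (t[0].pos == t[0].len) {       /* drop an exhausted subfile */
            for (size_t i = 1; i < k; i++)
                t[i - 1] = t[i];
            k--;
        } else {
            reinsert(t, k);               /* Step 1 */
        }
    }
    return n;
}
```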
Re: [HACKERS] Sorting Improvements for 8.4
P.S. A beautiful paper on replacement selection is found here: http://students.fim.uni-passau.de/~fickensc/Proseminar/Proseminar.pdf
Re: [HACKERS] Sorting Improvements for 8.4
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Gregory Stark Sent: Wednesday, December 19, 2007 5:26 PM To: Jeff Davis Cc: Mark Mielke; Michał Zaborowski; Simon Riggs; Ron Mayer; pgsql- [EMAIL PROTECTED] Subject: Re: [HACKERS] Sorting Improvements for 8.4 Jeff Davis [EMAIL PROTECTED] writes: test= explain analyze select * from sorter order by t; test= explain analyze select * from sorter order by b; test= explain analyze select * from sorter order by f; On my machine this table fits easily in memory (so there aren't any disk reads at all). Sorting takes 7 seconds for floats, 9 seconds for binary data, and 20 seconds for localized text. That's much longer than it would take to read that data from disk, since it's only 70MB (which takes a fraction of a second on my machine). I think this disproves your hypothesis that sorting happens at disk speed. I suspect most of that is spent just copying the data around. Which would not be helped by having multiple threads doing the copying -- and in fact might be exacerbated if it required an extra copy to consolidate all the data in the end. Benchmarking a single system will really only explain that system. Someone may have a disk farm with 2GB/Sec throughput: http://www.sql-server-performance.com/articles/per/system_storage_configuration_p1.aspx But such a configuration is very unlikely. Someone may have 10GB/S NIC cards, but those too, are rare. So for any benchmark, we will really just end up with a number for that system. Typically, disk is the bottleneck. 
I found this on the net somewhere, but it's quite a useful table for capacity planning (to find the weak link in the chain using back-of-the-envelope calculations):

Interface                                Width            Frequency    Bytes/Sec  Bits/Sec
---------------------------------------  ---------------  -----------  ---------  ----------
4-way interleaved PC1600 (DDR200) SDRAM  4 x 64 bits      100 MHz DDR  6.4 GB/s   51 Gbps
Opteron HyperTransport memory bus        128 bits         200 MHz DDR  6.4 GB/s   51 Gbps
Pentium 4 800 MHz FSB                    64 bits          200 MHz QDR  6.4 GB/s   51 Gbps
PC2 6400 (DDR-II 800) SDRAM              64 bits          400 MHz DDR  6.4 GB/s   51 Gbps
PC2 5300 (DDR-II 667) SDRAM              64 bits          333 MHz DDR  5.3 GB/s   43 Gbps
Pentium 4 533 MHz FSB                    64 bits          133 MHz QDR  4.3 GB/s   34 Gbps
PC2 4300 (DDR-II 533) SDRAM              64 bits          266 MHz DDR  4.3 GB/s   34 Gbps
2-channel PC1066 RDRAM                   2 x 16 bits      533 MHz DDR  4.3 GB/s   34 Gbps
PCI-X 533                                64 bits          533 MHz      4.3 GB/s   34 Gbps
PCI-Express x16                          serial/16 lanes  2.5 GHz      4 GB/s     32 Gbps
Pentium 4 400 MHz FSB                    64 bits          100 MHz QDR  3.2 GB/s   25.6 Gbps
2-channel PC800 RDRAM                    2 x 16 bits      400 MHz DDR  3.2 GB/s   25.6 Gbps
2-way interleaved PC1600 (DDR200) SDRAM  2 x 64 bits      100 MHz DDR  3.2 GB/s   25.6 Gbps
PC2 3200 (DDR-II 400) SDRAM              64 bits          200 MHz DDR  3.2 GB/s   25.6 Gbps
PC3200 (DDR400) SDRAM                    64 bits          200 MHz DDR  3.2 GB/s   25.6 Gbps
PC2700 (DDR333) SDRAM                    64 bits          167 MHz DDR  2.7 GB/s   21 Gbps
PC2100 (DDR266) SDRAM                    64 bits          133 MHz DDR  2.1 GB/s   17 Gbps
AGP 8x                                   32 bits          533 MHz      2.1 GB/s   17 Gbps
PCI-X 266                                64 bits          266 MHz      2.1 GB/s   17 Gbps
PCI-Express x8                           serial/8 lanes   2.5 GHz      2 GB/s     16 Gbps
EV6 bus (Athlon/Duron FSB)               64 bits          100 MHz DDR  1.6 GB/s   13 Gbps
PC1600 (DDR200) SDRAM                    64 bits          100 MHz DDR  1.6 GB/s   13 Gbps
PC800 RDRAM                              16 bits          400 MHz DDR  1.6 GB/s   13 Gbps
PC150 SDRAM                              64 bits          150 MHz      1.3 GB/s   10.2 Gbps
10 gigabit ethernet                      serial           10 GHz       1.25 GB/s  10 Gbps
OC-192                                   serial           9.953 GHz    1.24 GB/s  9.953 Gbps
133 MHz FSB                              64 bits          133 MHz      1.06 GB/s  8.5 Gbps
PC133 SDRAM                              64 bits          133 MHz      1.06 GB/s  8.5 Gbps
AGP 4x                                   32 bits          266 MHz      1.06 GB/s  8.5 Gbps
PCI-X                                    64 bits          133 MHz      1.06 GB/s  8.5 Gbps
PCI-Express x4                           serial/4 lanes   2.5 GHz      1 GB/s     8 Gbps
100 MHz FSB                              64 bits          100 MHz      800 MB/s   6.4 Gbps
PC100 SDRAM                              64 bits          100 MHz      800 MB/s   6.4 Gbps
PC66 SDRAM                               64 bits          66 MHz       533 MB/s   4.3 Gbps
fast/wide PCI                            64 bits          66 MHz       533 MB/s   4.3 Gbps
AGP 2x                                   32 bits          133 MHz      533 MB/s   4.3 Gbps
single-link DVI                          12 bits          165 MHz DDR  495 MB/s   3.96 Gbps
Ultra-320 SCSI                           16 bits          160 MHz      320 MB/s   2.6 Gbps
OC-48 network                            serial           2.488 GHz    311 MB/s   2.488 Gbps
AGP                                      32 bits          66 MHz       266 MB/s   2.1 Gbps
PCI-Express x1                           serial           2.5 GHz      250 MB/s   2 Gbps
Serial ATA/1500 disk                     serial           1.5 GHz      187 MB/s   1.5 Gbps
Ultra-160 SCSI                           16 bits          80 MHz       160 MB/s   1.3 Gbps
OC-24 network                            serial           1.244 GHz    155 MB/s   1.244 Gbps
PCI                                      32 bits          33 MHz       133 MB/s   1.06 Gbps
ATA/133 disk                             8 bits           66 MHz DDR   133 MB/s   1.06 Gbps
gigabit ethernet                         serial           1 GHz        125 MB/s   1 Gbps
ATA/100 disk                             8 bits           50 MHz DDR   100 MB/s
Re: [HACKERS] Current code for function j2date does not have the same correct dynamic range as older code.
From: Dann Corbit Sent: Tuesday, December 11, 2007 5:58 PM To: pgsql-hackers@postgresql.org Subject: Current code for function j2date does not have the same correct dynamic range as older code. It may not matter to the PostgreSQL group, since nothing goes wrong until the year is 1,465,002 or larger. It may also be an artifact of the Microsoft Visual C++ compiler. At any rate, the modular math to compute the year, month, and day does not remain accurate nearly as long for the new code as for the old. The old code here is j2dateOld(), and the new code is j2dateNew3(). I made a few tests with other internal types to try to extend the range of the new code, but neither long long nor mostly-unsigned arithmetic seems to restore the old range, because the mathematics are not identical. At any rate, here is a unit test driver you can fiddle with, if you so choose. Correction: It is the new routines that are sound. The old routines had overflow problems. So *cough* nevermind. [snip of code]
[HACKERS] Dumb question about binary cursors and #ifdef HAVE_INT64_TIMESTAMP
If I create a binary cursor on a recent version of PostgreSQL, how can I tell if the timestamp data internally is an 8 byte double or an 8 byte integer? I see an #ifdef that changes the code path to compute timestamps as one type or the other, but I do not know how to recognize the internal format of the type that will be returned in a binary cursor. How can I do that?
Re: [HACKERS] [GENERAL] Dumb question about binary cursors and #ifdefHAVE_INT64_TIMESTAMP
-Original Message- From: Alvaro Herrera [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 11, 2007 1:11 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org; [EMAIL PROTECTED] Subject: Re: [GENERAL] Dumb question about binary cursors and #ifdefHAVE_INT64_TIMESTAMP Dann Corbit wrote: If I create a binary cursor on a recent version of PostgreSQL, how can I tell if the timestamp data internally is an 8 byte double or an 8 byte integer? I see an #ifdef that changes the code path to compute timestamps as one type or the other, but I do not know how to recognize the internal format of the type that will be returned in a binary cursor. How can I do that? SHOW integer_timestamp; (actually, IIRC, this is one of the params that the server will send you at session start). I guess that I am supposed to check for error on the statement? What does it look like when the query works? This is what I get against PostgreSQL 8.2.5 using PG Admin III query tool: ERROR: unrecognized configuration parameter "integer_timestamp"
Re: [HACKERS] [GENERAL] Dumb question about binary cursors and #ifdefHAVE_INT64_TIMESTAMP
-Original Message- From: Alvaro Herrera [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 11, 2007 1:11 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org; [EMAIL PROTECTED] Subject: Re: [GENERAL] Dumb question about binary cursors and #ifdefHAVE_INT64_TIMESTAMP Dann Corbit wrote: If I create a binary cursor on a recent version of PostgreSQL, how can I tell if the timestamp data internally is an 8 byte double or an 8 byte integer? I see an #ifdef that changes the code path to compute timestamps as one type or the other, but I do not know how to recognize the internal format of the type that will be returned in a binary cursor. How can I do that? SHOW integer_timestamp; (actually, IIRC, this is one of the params that the server will send you at session start). Tom's post clued me in. It's: show integer_datetimes; Or (in my case): PQparameterStatus(conn, "integer_datetimes")
[HACKERS] Selecting a constant question: A summary
First a comment: At CONNX Solutions Inc., we believe sincerely that we should do whatever is necessary to make our customers prosper. This means creation of excellent tools and being responsive to customer needs. Secondly, we believe that we should treat the customers the way that we want to be treated. I think that the PostgreSQL group has managed the first objective, but not the second. Of course, that is only an opinion, but I think that success hinges on both factors. Our objective in this issue has also been to improve PostgreSQL so that it can become more useful to the end users and not to denigrate the work of the engineers that have toiled on it. I will also admit that frustration has caused our tone to become sharp at times. This is clearly a mistake on our part and for this, I apologize. Next, the problem: According to SQL/CLI and ODBC 3.5, we should bind the length of a character column. Here are some references from the relevant documentation (SQL/CLI and ODBC are clones of one another): == ANSI/ISO/IEC 9075-3-1999 for Information Technology Database Language SQL Part 3: Call-Level Interface (SQL/CLI) Section 6.5 BindCol Along with function SQLBindCol from the ODBC specification http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcsqlbindcol.asp This bit should be sufficient to explain what we are after: BufferLength [Input] Length of the *TargetValuePtr buffer in bytes. The driver uses BufferLength to avoid writing past the end of the *TargetValuePtr buffer when returning variable-length data, such as character or binary data. Note that the driver counts the null-termination character when returning character data to *TargetValuePtr. *TargetValuePtr must therefore contain space for the null-termination character or the driver will truncate the data. When the driver returns fixed-length data, such as an integer or a date structure, the driver ignores BufferLength and assumes the buffer is large enough to hold the data. 
It is therefore important for the application to allocate a large enough buffer for fixed-length data or the driver will write past the end of the buffer. SQLBindCol returns SQLSTATE HY090 (Invalid string or buffer length) when BufferLength is less than 0 but not when BufferLength is 0. However, if TargetType specifies a character type, an application should not set BufferLength to 0, because ISO CLI-compliant drivers return SQLSTATE HY090 (Invalid string or buffer length) in that case. == Now, there are times when (according to the spec) we have to defer binding. However, this causes great problems for end user tools and should only be done in what is basically a dire emergency. In the case of a SELECT query that selects a fixed constant of any sort, it would be a definite improvement for PostgreSQL to give some sort of upper maximum. For example: SELECT 'Eastern Division', sum(Inventory_level), sum(Inventory_backorder), Manager_last_name FROM table_name WHERE division_id = 9 GROUP BY Manager_last_name Will return four columns of data. The first column is of unknown length. Imagine if you are a spreadsheet in OpenOffice: http://www.openoffice.org/ which happens to support ODBC connections. You would like to fill out a report for the president of your company. Unfortunately, the first column is of unknown length. That makes it a bit difficult to format this spreadsheet. Now, I will admit that we may not know a priori if 'Eastern Division' is character or Unicode or MBCS. But in the worst case scenario it will be (16 + 1) * element_width bytes in length. For some Unicode character sets, element_width can be as much as 4, so that leaves 68 octets as an upper possible maximum. Now, you might protest that 68 bytes might be much too large. That is true, but I know that if I allocate 68 bytes we will not have data truncation. It is no worse than a varchar(255) field that has a largest item 15 characters wide in it. 
The grid will successfully bind and we will be able to produce the report. Generally speaking, grids are smart enough to automatically resize themselves to max_length(grid_column_title, grid_column_data) and so the report will look very nice. It is also true that it is possible for us to work around the problem. We certainly can know the exact type information about the constants in our queries and reformat the PostgreSQL queries to decorate them with things like: SELECT 'Eastern Division'::char(16), sum(Inventory_level)::Numeric(16,4), sum(Inventory_backorder)::Numeric(16,4), Manager_last_name FROM table_name WHERE division_id = 9 GROUP BY Manager_last_name But it would be very nice if the database could provide a good estimate for us so that PostgreSQL could work like all of the other database systems. Code full of
Re: [HACKERS] Selecting a constant question
-Original Message- From: Larry McGhaw Sent: Tuesday, June 12, 2007 1:40 PM To: Martijn van Oosterhout Cc: Andrew Dunstan; Hannu Krosing; Tom Lane; Alvaro Herrera; Dann Corbit; Gregory Stark; pgsql-hackers@postgresql.org Subject: RE: [HACKERS] Selecting a constant question For what it's worth .. Your statement about why we are the first people to mention this problem really got me thinking. Anyone who would attempt to write an ODBC driver for Postgres would run into the exact same issue. So I installed the official Postgres ODBC driver, and ran the identical query, and here are my results: I probably should have looked at this first. There is a whole Postgres ODBC dialog dedicated to the very subject of this thread: Handling of unknown data sizes. The pgodbc driver is configured to treat unknowns as varchar(255) by default, as shown by my example below. This can be configured up or down as desired. SQLExecDirect: In: hstmt = 0x003C18E0, szSqlStr = Select a,b,c, '123' , '123'::char(3), '123'::varchar(3) from..., cbSqlStr = -3 Return: SQL_SUCCESS=0 Describe Column All: icol, szColName, *pcbColName, *pfSqlType, *pcbColDef, *pibScale, *pfNullable 1, a, 1, SQL_VARCHAR=12, 20, 0, SQL_NULLABLE=1 2, b, 1, SQL_CHAR=1, 10, 0, SQL_NULLABLE=1 3, c, 1, SQL_INTEGER=4, 10, 0, SQL_NULLABLE=1 4, ?column?, 8, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1 5, bpchar, 6, SQL_CHAR=1, 3, 0, SQL_NULLABLE=1 6, varchar, 7, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1 From psqlodbc.h #define MAX_VARCHAR_SIZE 255 /* default maximum size of * varchar fields (not including null term) */ So I guess the bottom line is that we are not the first to encounter this problem .. It's just been covered up by assigning an arbitrary maximum size .. So I guess we will do the same and make it configurable like the official postgres driver. Of course, the downside here is that choosing a default will truncate the data when the actual data is larger than the default chosen. 
[HACKERS] Selecting a constant question
SELECT 1 FROM test.dbo.a_003 gets about 60,000 records per second SELECT '1' FROM test.dbo.a_003 gets about 600 records per second. The cause is that postgres describes the return column as unknown length 65534 in the 2nd case. Since the value is a constant, it seems rather odd to make the length 65534 characters. Why not make it char(1) or some other appropriate and less costly data type? After all, it isn't going to grow during the query.
Re: [HACKERS] Selecting a constant question
-Original Message- From: Gregory Stark [mailto:[EMAIL PROTECTED] Sent: Monday, June 11, 2007 12:48 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Selecting a constant question Dann Corbit [EMAIL PROTECTED] writes: SELECT 1 FROM test.dbo.a_003 gets about 60,000 records per second SELECT '1' FROM test.dbo.a_003 gets about 600 records per second. The cause is that postgres describes the return column as unknown length 65534 in the 2nd case. Wait, back up. How does this cause it to go slower? The issue is this: Postgres describes the column with a typmod of -1 (unknown) and a length of 65534. This means that any client application must reserve 65534 bytes of space for every row of data (like a grid control for example), when postgres should know (and report) that the maximum length of the column is 1. This is not a PSQL issue, it's an issue with other products relying on the accuracy of the reported postgres metadata for a given SQL statement.
Re: [HACKERS] Selecting a constant question
-Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Monday, June 11, 2007 1:32 PM To: Dann Corbit Cc: Gregory Stark; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Selecting a constant question Dann Corbit [EMAIL PROTECTED] writes: The issue is this: Postgres describes the column with a typmod of -1 (unknown) and a length of 65534. Oh, you're looking at typlen not typmod. Please observe the comments in pg_type.h: /* * For a fixed-size type, typlen is the number of bytes we use to * represent a value of this type, e.g. 4 for an int4. But for a * variable-length type, typlen is negative. We use -1 to indicate a * varlena type (one that has a length word), -2 to indicate a * null-terminated C string. */ int2 typlen; You should be treating typlen as signed not unsigned, and not assuming a fixed width for any negative value. Since the width refers to the server internal representation, and not to what comes down the wire, I find it pretty strange for an application to be using typlen for anything at all actually. Thanks for the response. Since libpq function PQfsize returns -2 for all constant character strings in SQL statements ... What is the proper procedure to determine the length of a constant character column after query execution but before fetching the first row of data?
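The 65534 in question is just -2 viewed through an unsigned 16-bit lens. A small illustrative C sketch of the reinterpretation (typlen_as_signed is a made-up helper; the unsigned-to-signed cast relies on the two's-complement behavior every mainstream compiler provides):

```c
#include <assert.h>
#include <stdint.h>

/* typlen is a signed int2; read through an unsigned 16-bit type,
 * -1 (varlena) shows up as 65535 and -2 (C string) as 65534. */
static int16_t typlen_as_signed(uint16_t wire_value)
{
    return (int16_t) wire_value;   /* 65534 -> -2, 65535 -> -1 */
}
```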
Re: [HACKERS] Selecting a constant question
-Original Message- From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED] Sent: Monday, June 11, 2007 1:46 PM To: Dann Corbit Subject: Re: [HACKERS] Selecting a constant question On Mon, Jun 11, 2007 at 01:29:37PM -0700, Dann Corbit wrote: Our application is using the libPQ interface to access postgres. The query is select '123' from tablename .. the table is not important. After executing the query, we interrogate the metadata of the result set using the PQfsize, PQfmod and PQftype functions. Did you read the documentation of the PQfsize function? PQfsize returns the space allocated for this column in a database row, in other words the size of the server's internal representation of the data type. (Accordingly, it is not really very useful to clients.) A negative value indicates the data type is variable-length. http://www.postgresql.org/docs/8.2/interactive/libpq-exec.html#LIBPQ-EXEC-SELECT-INFO The size of the column is returned as 65534 (or -2 if you consider this a signed short value) It's variable length, you can't say anything more. So what you are saying is that the constant '1' is variable length, and there is no way to find out the maximum length from the database.
Re: [HACKERS] Selecting a constant question
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Dann Corbit Sent: Monday, June 11, 2007 1:52 PM To: Martijn van Oosterhout Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Selecting a constant question So what you are saying is that the constant '1' is variable length, and there is no way to find out the maximum length from the database. I have a PostgreSQL feature request: Report the maximum size of a variable length string from the server. Surely, we cannot be the only people who will need this information. If (for example) someone wants to bind to a grid, then the maximum size has to be known in advance.
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: Gregory Stark [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 2:41 PM
To: Dann Corbit
Cc: Martijn van Oosterhout; pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question

> Dann Corbit [EMAIL PROTECTED] writes:
> > Surely, we cannot be the only people who will need this information. If (for example) someone wants to bind to a grid, then the maximum size has to be known in advance.
>
> In fact psql needs it and implements this. It has to skim through the entire result set to calculate the column widths. It's quite a lot of work, but the server is in no better position to do it than psql.

Reading the data twice sounds a little painful. What if there are 30 million rows?

> On the contrary, the server is missing quite a bit of information about how you intend to display the data. Do you need the number of bytes or characters? Are all the characters the same width in your display system? What about currency symbols? Do you intend to reverse any quoting or just display backslashes?

Giving me the information about the data type will be enough. As an example, in this case we have varchar data. If the server should be so kind as to report varchar(1) for '1' or varchar(3) for '123' then I would not have any difficulty binding the data to a grid.

> Even knowing how many characters, and assuming fixed character widths, that wouldn't even be enough to set your grid control widths. Usually people like numeric quantities decimal aligned, and so two records 1.00 and 0.01 will take much more width than two records with 1.00 and 2.00.

SQL Server, Oracle, Ingres, DB/2 and other database systems somehow manage to do it, so I guess it is not technically intractable. I suspect that your own ODBC/JDBC and other drivers suffer from this same effect. Now, I do recognize that sometimes nobody is going to know how big something is, including the server. But with a query using a constant, it seems like it ought to be well defined to me.
Perhaps the difficulties are escaping me because I am not familiar with the low level guts of this problem. But I suspect that lots of people besides me would benefit if sizes of things were known when it is possible to know them. As I said before, I see that it cannot be known right now. So I am putting it in as a feature request. If you could be so kind as to point out the right spot to look in the server code, I imagine we could fix it and check in the patch ourselves. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.
These two calls make our remote queries via libpq about twice as fast on average. It seems to me like it might be a nice addition to the core product's libpq (I poked it into the spot where the Nagle algorithm is turned off, but another place would be fine too). Can anyone give me a reason why it is a bad idea to add this in? If it were made a parameter with a default of 64K, that would be even better. Then it could be tuned to particular systems for maximum throughput.

    on = 65535;
    if (setsockopt(conn->sock, SOL_SOCKET, SO_RCVBUF,
                   (char *) &on, sizeof(on)) < 0)
    {
        char sebuf[256];

        printfPQExpBuffer(&conn->errorMessage,
                          libpq_gettext("could not set socket SO_RCVBUF window size: %s\n"),
                          SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
        return 0;
    }

    on = 65535;
    if (setsockopt(conn->sock, SOL_SOCKET, SO_SNDBUF,
                   (char *) &on, sizeof(on)) < 0)
    {
        char sebuf[256];

        printfPQExpBuffer(&conn->errorMessage,
                          libpq_gettext("could not set socket SO_SNDBUF window size: %s\n"),
                          SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
        return 0;
    }
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: Alvaro Herrera [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 3:16 PM
To: Dann Corbit
Cc: Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question

> Dann Corbit wrote:
> > > In fact psql needs it and implements this. It has to skim through the entire result set to calculate the column widths. It's quite a lot of work but the server is in no better position to do it than psql.
> >
> > Reading the data twice sounds a little painful. What if there are 30 million rows?
>
> You get an out of memory error.
>
> > > On the contrary the server is missing quite a bit of information of how you intend to display the information. Do you need the number of bytes or characters? Are all the characters the same width in your display system? What about currency symbols? Do you intend to reverse any quoting or just display backslashes?
> >
> > Giving me the information about the data type will be enough. As an example, in this case we have varchar data. If the server should be so kind as to report varchar(1) for '1' or varchar(3) for '123' then I would not have any difficulty binding the data to a grid.
>
> Oh, you have the length information for each datum all right. It's on the first four bytes of it.

Sure, but when I bind to a grid, I need to know a priori how big the biggest returned instance can be. Reading the entire data set twice to learn the size of a constant seems rather conceptually odd to me.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 3:29 PM
To: Dann Corbit
Cc: Alvaro Herrera; Gregory Stark; pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question

> On Mon, Jun 11, 2007 at 03:18:33PM -0700, Dann Corbit wrote:
> > Sure, but when I bind to a grid, I need to know a priori how big the biggest returned instance can be. Reading the entire data set twice to learn the size of a constant seems rather conceptually odd to me.
>
> To be honest, the concept that a widget requires a constant that can't be changed later is also a bit odd.

Not when the data itself is a constant that cannot be changed.

> There are many times you won't know beforehand how big the data is; surely the framework should be smart enough to handle these cases?

If it were impossible to know the size of a string constant supplied in the query, then I think I would agree with you here. However, it seems to me that the maximum possible size of such a known, constant-width string is not hard to determine.

> Start the width at 100; if it turns out to be too small, make it bigger...

If that were a good idea, then why report data sizes at all? Just let it always be a surprise when it comes streaming down the pipe. Honestly, I cannot fathom this answer.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.
-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 3:41 PM
To: Dann Corbit
Cc: pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

> Dann Corbit [EMAIL PROTECTED] writes:
> > These two calls make our remote queries via libpq about twice as fast on average.
>
> And, perhaps, cause even greater factors of degradation in other scenarios (not to mention the possibility of complete failure on some platforms). You haven't provided nearly enough evidence that this is a safe change to make.

May I suggest:
http://www-didc.lbl.gov/TCP-tuning/setsockopt.html
http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html

We test against dozens of operating systems, and we have never had a problem (generally, we use our own TCP/IP network objects for communication, and we only recently figured out why PostgreSQL was lagging so far behind and patched libPQ ourselves). Now, it will be about 2 weeks before our full regressions have run against PostgreSQL on all of our platforms, but we do adjust the TCP/IP window on all of our clients and servers and have yet to find one that is unable to either negotiate a decent size or, at worst, ignore our request.

However, I won't twist your arm. I just wanted to be sure that those at the PostgreSQL organization were aware of this simple trick. Our products run on:

AIX
BeOS
HP-UX
Linux (everywhere, including mainframe zLinux)
MVS
SunOS
Solaris
OpenVMS Alpha
OpenVMS VAX
OpenVMS Itanium
Windows

and several others.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: Alvaro Herrera [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 3:44 PM
To: Dann Corbit
Cc: Tom Lane; Gregory Stark; Martijn van Oosterhout; pgsql-[EMAIL PROTECTED]; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question

> Dann Corbit wrote:
> > If the server bound the data as UNICODE, then it will tell me UNICODE(3). I know how big this will be. In the worst case scenario it will fit in 3*4 = 12 bytes. If the server is built without UNICODE enabled, then it will definitely fit in 3 bytes.
>
> Unless it's some other multibyte encoding. And nowadays, the server is always unicode enabled.

The stuff sent down the wire is unicode or not depending on a configuration parameter. Even at that, we still know an absolute maximum of 12 bytes.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 3:50 PM
To: Dann Corbit
Cc: Martijn van Oosterhout; Alvaro Herrera; Gregory Stark; pgsql-[EMAIL PROTECTED]; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question

> Dann Corbit [EMAIL PROTECTED] writes:
> > > To be honest, the concept that a widget requires a constant that can't be changed later is also a bit odd.
> >
> > Not when the data itself is a constant that cannot be changed.
>
> Surely this case is not sufficiently important to justify designing your entire application (not to mention the client/server protocol) around it. You're always going to have variable-width columns in there somewhere.

Right. But normally I get back a length for those variable-length columns, or I can collect it from the metadata of the database. Surely, PostgreSQL can determine the size of a constant string. Otherwise it would be impossible to know whether it would be safe to insert a constant string into a database column. PostgreSQL has decided upon a data type, and gives me data bound in that type. It is only the length that it is unwilling to divulge.

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.
-----Original Message-----
From: Greg Smith [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 4:09 PM
To: Dann Corbit
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

> On Mon, 11 Jun 2007, Dann Corbit wrote:
> > These two calls make our remote queries via libpq about twice as fast on average.
>
> Can you comment a bit on what your remote queries are typically doing? You'll need to provide at least an idea what type of test case you're seeing the improvement on for others to try and replicate it.

We have literally thousands (maybe hundreds of thousands -- I'm not totally sure exactly how many there are, because I am in development and not in testing) of queries, which take dozens of machines over a week to run. Our queries include inserts, updates, deletes, joins, views, um... you name it. Our product is database middleware, and so we have to test anything that is a legal SQL query against every sort of database and operating system combination (PostgreSQL is one of many environments that we support). If you have seen the NIST SQL verification suite, that is part of our test suite. We also found the PostgreSQL suite useful (though the PostgreSQL-specific things we only run against PostgreSQL). We also have our own collection of regression tests that we have gathered over the past 20 years or so.

I can't be specific because we run every sort of query. Most of our hardware is fairly high end (generally gigabit Ethernet, but we do have some machines that only have 100-megabit network cards in them). I guess that our usage is atypical for general business use but fairly typical for companies that produce middleware tool sets. However, many of our regressions came from customer feedback, and so we do test lots and lots of valid customer requirements.

I have a simple suggestion: put the setsockopt calls in (with the necessary fluff to make it robust) and then perform the OSDB test.
I guess that unless the OSDB houses the clients and the server on the same physical hardware, you will see a very large bonus for a very simple change.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.
-----Original Message-----
From: Andrew Dunstan [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 4:35 PM
To: Dann Corbit
Cc: Tom Lane; pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

> Dann Corbit wrote:
> > However, I won't twist your arm. I just wanted to be sure that those at the PostgreSQL organization were aware of this simple trick. Our products run on: AIX, BeOS, HP-UX, Linux (everywhere, including mainframe zLinux), MVS, SunOS, Solaris, OpenVMS Alpha, OpenVMS VAX, OpenVMS Itanium, Windows, and several others.
>
> We already set the SNDBUF on Windows for reasons documented in the code.

The only place I see it is for Windows *only* in pqcomm.c (to 32K). Did I miss it somewhere else?

> I think if you were to quantify the alleged improvement by platform it might allay suspicion.

I do not know if you will see the same results as we do. We support ancient and modern operating systems, on ancient and modern hardware (we have OpenVMS 6.1 running Rdb as old as 4.2, for instance -- 1980s technology). The only way for you to see if your environments have the same sort of benefits that we see is to test it yourselves. The TCP/IP window size is such a well known optimization setting (in fact the dominant one) that I am kind of surprised to be catching anyone unawares.

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:pgsql-hackers-[EMAIL PROTECTED] On Behalf Of Kris Jurka
Sent: Monday, June 11, 2007 5:04 PM
To: Larry McGhaw
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question

> On Mon, 11 Jun 2007, Larry McGhaw wrote:
> > I think perhaps we have lost sight of the main issue:
> > 2) libpq can properly describe the maximum internal data size of any varchar column via PQfmod
>
> SELECT cola || colb FROM tab;

Suggestion: return (column size of cola) + (column size of colb) in the maximum length field.

> > 3) libpq can properly describe the maximum internal data size of any numeric constant in a SQL statement via PQfsize
>
> SELECT 3::numeric;

Suggestion: return sizeof(numeric(1,0)) -- after all, it's a constant here.

In the words of the great poet Spike Lee: 'Always do the right thing.'

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:pgsql-hackers-[EMAIL PROTECTED] On Behalf Of Andrew Dunstan
Sent: Monday, June 11, 2007 5:12 PM
To: Larry McGhaw
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question

> Larry McGhaw wrote:
> > 4) libpq **cannot** describe the maximum internal data size of a char or varchar constant! Example: select '123' from any table. This is clearly a bug or serious oversight in libpq that should be addressed. The database *knows* the size of the char constant (obviously), and should report the size via a metadata call, as all other relational databases do.
>
> What is not clear to me is why it is so important for you to know the length of a piece of data you are supplying. If it is so vitally important, you could always cast it, e.g.
>
>     select '123'::varchar(3)

We're a middleware company. We are not in control of the queries that are sent. We can intercept and reformat them, and perhaps that is what we will need to do for PostgreSQL.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.
-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 5:12 PM
To: Dann Corbit
Cc: pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

> Dann Corbit [EMAIL PROTECTED] writes:
> > May I suggest:
> > http://www-didc.lbl.gov/TCP-tuning/setsockopt.html
> > http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html
>
> I poked around on those pages and almost immediately came across http://www.psc.edu/networking/projects/tcptune/ which appears more up-to-date than the other pages, and it specifically recommends *against* setting SO_SNDBUF or SO_RCVBUF on modern Linuxen. So that's one fairly large category where we probably do not want this.

It can still be a good idea to set it: http://datatag.web.cern.ch/datatag/howto/tcp.html
64K was just an example. Like I said before, it should be configurable.

> You have not even made it clear whether you were increasing the sizes in the server-to-client or client-to-server direction, and your handwaving about the test conditions makes it even harder to know what you are measuring. I would think, for instance, that local vs. remote connections make a big difference and might need different tuning.

The configuration is a negotiation between client and server. You may or may not get what you ask for. I suggest that it is simple to implement and worthwhile to test. But it was only a suggestion.

> BTW, if we look at this issue we ought to also look at whether the send/recv quantum in libpq and the backend should be changed. It's been 8K for about ten years now ...

I suspect that TCP/IP packetizing will moderate the effects of changes to this parameter.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 5:32 PM
To: Larry McGhaw
Cc: Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question

> Larry McGhaw [EMAIL PROTECTED] writes:
> > I think perhaps we have lost sight of the main issue:
> > 1) libpq can properly describe the maximum internal data size of any numeric or char column in a table via PQfsize
> > 2) libpq can properly describe the maximum internal data size of any varchar column via PQfmod
> > 3) libpq can properly describe the maximum internal data size of any numeric constant in a SQL statement via PQfsize
>
> None of the above statements are actually true, at least not when you take off your blinders and note the existence of unconstrained-width numeric and text columns.

Unconstrained-width columns are not what are being discussed here. It is constant expressions of known width.

> > The database *knows* the size of the char constant (obviously),
>
> No, what it knows (and reports) is type information. There are a small number of datatypes where you can infer a maximum width from knowledge of the datatype. There are many others where you can't set an upper bound from this knowledge --- at least not a usefully tight one.

If you do not know how large 1::numeric is, then how can you know whether it is safe to insert it into a column of type numeric(12,4)? If you do not know how large 'Joe'::varchar is, then how can you know whether it is safe to insert it into a column of type varchar(256)? Clearly, neither of these operations will cause any problems, and so the size of a constant can be determined.

> Anyway, if we were to cast those constants to something other than unknown, it would be text, not varchar, and you'd still have the same issue.

Other database systems can manage this, and the programmers of those database systems are not smarter than the programmers of the PostgreSQL group.
Therefore I can conclude that if the PostgreSQL group decides it is important, then they can figure out the size of a string or numeric constant. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 3:35 PM
To: Dann Corbit
Cc: Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question

> Dann Corbit [EMAIL PROTECTED] writes:
> > Giving me the information about the data type will be enough. As an example, in this case we have varchar data. If the server should be so kind as to report varchar(1) for '1' or varchar(3) for '123' then I would not have any difficulty binding the data to a grid.
>
> This seems merest fantasy. Reflect on multibyte character sets for a bit --- even if it's known that the column is varchar(3), there is no guarantee that the value will fit in 3 bytes.

If the server bound the data as UNICODE, then it will tell me UNICODE(3). I know how big this will be. In the worst-case scenario it will fit in 3*4 = 12 bytes. If the server is built without UNICODE enabled, then it will definitely fit in 3 bytes.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
-----Original Message-----
From: Hannu Krosing [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 8:42 PM
To: Dann Corbit
Cc: Tom Lane; Gregory Stark; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question

> On Monday, June 11, 2007 at 13:38, Dann Corbit wrote:
> > Tom Lane wrote:
> > > ... You should be treating typlen as signed, not unsigned, and not assuming a fixed width for any negative value. Since the width refers to the server-internal representation, and not to what comes down the wire, I find it pretty strange for an application to be using typlen for anything at all, actually.
> >
> > Thanks for the response. Since libpq function PQfsize returns -2 for all constant character strings in SQL statements ... what is the proper procedure to determine the length of a constant character column after query execution but before fetching the first row of data?
>
> Why not just get the first row and determine the width from it before you actually use any of the data?

What if the second row is 1000x longer?

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match