Re: [HACKERS] ScanKey representation for RowCompare index conditions

2006-01-16 Thread Simon Riggs
On Sun, 2006-01-15 at 18:03 -0500, Tom Lane wrote: There's one nontrivial decision still to make about how to implement proper per-spec row-comparison operations, namely: how a row comparison ought to be represented in the index access method API. I'm not sure I understand why. Surely a row

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Simon Riggs
On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A couple of days ago I found myself wanting to aggregate 3 Billion tuples down to 100 Million tuples based on an integer key with six integer values -- six sum()'s. PostgreSQL ran out of memory with its Hash Aggregator and doing an old

Re: [HACKERS] Coding standards? Recommendations?

2006-01-16 Thread Simon Riggs
On Sun, 2006-01-15 at 11:14 -0500, korry wrote: I've noticed a variety of coding styles in the PostgreSQL source code. In particular, I see a mix of naming conventions. Some variables use camelCase (or CamelCase), others use under_score_style. I just follow the style of the module I'm

[HACKERS] source documentation tool doxygen

2006-01-16 Thread Joachim Wieland
I've created a browsable source tree documentation, it's done with the doxygen tool. http://www.mcknight.de/pgsql-doxygen/cvshead/html/ There was a discussion about this some time ago, Jonathan Gardner proposed it here: http://archives.postgresql.org/pgsql-hackers/2004-03/msg00748.php quite a

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Thomas Hallgren
I wish I'd had this when I started working with PostgreSQL. This looks really good. Very useful indeed, even without the comments. What kind of changes are needed in order to get the comments in? Regards, Thomas Hallgren Joachim Wieland wrote: I've created a browsable source tree

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Kim Bisgaard
Try following the link (the Doxygen icon) - it has both a tutorial and extensive doc. Regards, Kim Bisgaard Thomas Hallgren wrote: I wish I've had this when I started working with PostgreSQL. This looks really good. Very useful indeed, even without the comments. What kind of changes are

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Andrew Dunstan
Thomas Hallgren said: I wish I've had this when I started working with PostgreSQL. This looks really good. Very useful indeed, even without the comments. What kind of changes are needed in order to get the comments in? I too have done this. But retrofitting Doxygen style comments to the

[HACKERS] [PATCH] Better way to check for getaddrinfo function.

2006-01-16 Thread R, Rajesh (STSD)
Just thought that the following patch might improve checking for getaddrinfo function (in configure.in). I was forced to write 'coz getaddrinfo went unnoticed in Tru64 Unix. (displaying attached patch) $ diff -r configure.in

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Rod Taylor
On Mon, 2006-01-16 at 08:32 +, Simon Riggs wrote: On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A couple of days ago I found myself wanting to aggregate 3 Billion tuples down to 100 Million tuples based on an integer key with six integer values -- six sum()'s. PostgreSQL ran

Re: [HACKERS] ScanKey representation for RowCompare index conditions

2006-01-16 Thread Martijn van Oosterhout
On Sun, Jan 15, 2006 at 06:03:12PM -0500, Tom Lane wrote: There's one nontrivial decision still to make about how to implement proper per-spec row-comparison operations, namely: how a row comparison ought to be represented in the index access method API. The current representation of index

Re: [HACKERS] ScanKey representation for RowCompare index conditions

2006-01-16 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: On Sun, 2006-01-15 at 18:03 -0500, Tom Lane wrote: There's one nontrivial decision still to make about how to implement proper per-spec row-comparison operations, namely: how a row comparison ought to be represented in the index access method API. I'm

[HACKERS] Docs off on ILIKE indexing?

2006-01-16 Thread Magnus Hagander
http://www.postgresql.org/docs/8.1/static/indexes-types.html says: The optimizer can also use a B-tree index for queries involving the pattern matching operators LIKE, ILIKE, ~, and ~*, if the pattern is a constant and is anchored to the beginning of the string - for example, col LIKE 'foo%' or
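The quoted documentation sentence about anchored patterns can be illustrated with a small sketch. This is not PostgreSQL code; it only shows, assuming plain code-point string ordering, why a constant pattern anchored at the start (col LIKE 'foo%') can be answered as a range scan over sorted keys. The names prefix_range and keys are made up for the illustration.

```python
from bisect import bisect_left

def prefix_range(prefix):
    """Return (low, high) so that low <= s < high holds exactly for strings
    starting with prefix, assuming code-point ordering (an assumption)."""
    high = prefix[:-1] + chr(ord(prefix[-1]) + 1)  # 'foo' -> 'fop'
    return prefix, high

# Hypothetical sorted index keys, standing in for B-tree leaf order.
keys = sorted(["bar", "foo", "foobar", "fop", "fox", "food"])

low, high = prefix_range("foo")
hits = keys[bisect_left(keys, low):bisect_left(keys, high)]
print(hits)
```

The range trick relies on the ordering of the keys agreeing with the prefix bounds, which is why the sketch states its ordering assumption explicitly.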

Re: [HACKERS] ScanKey representation for RowCompare index conditions

2006-01-16 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: ISTM that row-wise comparisons, as far as indexes are concerned are actually simpler than normal scan-keys. For example, if you have the condition (a,b) = (5,1) then once the index has found that point, every subsequent entry in the index
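A rough sketch of the idea in the snippet above, not taken from the thread: it assumes the condition is a row-wise "greater than or equal" comparison (the operator is not fully legible in the archived snippet), ignores NULLs, and models a two-column index as a sorted Python list. The names row_ge and index_entries are hypothetical.

```python
from bisect import bisect_left

def row_ge(row, bound):
    """Row comparison row >= bound, decided column by column, left to right."""
    for value, limit in zip(row, bound):
        if value != limit:
            return value > limit  # the first differing column decides
    return True                   # all columns equal

# Hypothetical index contents, sorted on (a, b) like a two-column B-tree.
index_entries = sorted([(4, 9), (5, 0), (5, 1), (5, 7), (6, 0), (7, 3)])

# Locate the boundary point once; everything after it satisfies the condition.
start = bisect_left(index_entries, (5, 1))
matches = index_entries[start:]
print(matches)
```

Because the index order and the row comparison are both lexicographic, a single descent to the boundary is enough in this model; no per-entry recheck is needed after it.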

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A couple of days ago I found myself wanting to aggregate 3 Billion tuples down to 100 Million tuples based on an integer key with six integer values -- six sum()'s. There is already hash table overflow

Re: [HACKERS] [GENERAL] [PATCH] Better way to check for getaddrinfo function.

2006-01-16 Thread Tom Lane
R, Rajesh (STSD) [EMAIL PROTECTED] writes: Just thought that the following patch might improve checking for getaddrinfo function (in configure.in) Since AC_TRY_RUN tests cannot work in cross-compilation scenarios, you need an *extremely* good reason to put one in. I thought this might improve

Re: [HACKERS] Docs off on ILIKE indexing?

2006-01-16 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes: http://www.postgresql.org/docs/8.1/static/indexes-types.html says: The optimizer can also use a B-tree index for queries involving the pattern matching operators LIKE, ILIKE, ~, and ~*, if the pattern is a constant and is anchored to the beginning of

Re: [HACKERS] Warm-up cache may have its virtue

2006-01-16 Thread Jim C. Nasby
On Sat, Jan 14, 2006 at 04:13:56PM -0500, Qingqing Zhou wrote: Qingqing Zhou [EMAIL PROTECTED] wrote I wonder if we should really implement file-system-cache-warmup strategy which we have discussed before. There are two natural good places to do this: (1) sequential scan (2)

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-16 Thread Jim C. Nasby
On Fri, Jan 13, 2006 at 11:37:38PM -0500, Tom Lane wrote: Josh Berkus josh@agliodbs.com writes: It's also worth mentioning that for datatypes that only have an = operator the performance of compute_minimal_stats is O(N^2) when values are unique, so increasing sample size is a very bad idea
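To make the O(N^2) remark concrete, here is a minimal sketch. It is not the compute_minimal_stats implementation; it only shows that when a datatype offers nothing but an equality operator (no ordering, no hashing), finding distinct values means comparing each new value against every value seen so far. The function name is made up.

```python
def count_distinct_eq_only(values):
    """Count distinct values using only ==; also return the comparison count."""
    seen = []
    comparisons = 0
    for v in values:
        found = False
        for s in seen:
            comparisons += 1
            if v == s:
                found = True
                break
        if not found:
            seen.append(v)
    return len(seen), comparisons

# With N unique values the inner loop runs 0 + 1 + ... + (N - 1) times.
print(count_distinct_eq_only(list(range(100))))
print(count_distinct_eq_only(list(range(200))))
```

Doubling the number of unique values roughly quadruples the comparison count in this model, which is the behaviour the quoted message warns about when the sample size is increased.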

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Neil Conway
On Mon, 2006-01-16 at 07:57 -0600, Andrew Dunstan wrote: I too have done this. But retrofitting Doxygen style comments to the PostgreSQL source code would be a big undertaking. Maintaining it, which would be another task for reviewers/committers, would also be a pain unless there were some

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes: I don't think it would be all that painful. There would be no need to convert the entire source tree to use proper Doxygen-style comments in one fell swoop: individual files and modules can be converted whenever anyone gets the inclination to do so. I

Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-16 Thread Jim C. Nasby
On Sat, Jan 14, 2006 at 07:28:21PM +0900, Michael Glaesemann wrote: On Jan 13, 2006, at 21:42 , Leandro Guimarães Faria Corcete DUTRA wrote: If you still declare the natural key(s) as UNIQUEs, you have just made performance worse. Now there are two keys to be checked on UPDATEs and

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Simon Riggs
On Mon, 2006-01-16 at 09:42 -0500, Rod Taylor wrote: On Mon, 2006-01-16 at 08:32 +, Simon Riggs wrote: On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A question: Are the rows in your 3 B row table clumped together based upon the 100M row key? (or *mostly* so) We might also be

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Simon Riggs
On Mon, 2006-01-16 at 12:36 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2006-01-16 at 00:07 -0500, Rod Taylor wrote: A couple of days ago I found myself wanting to aggregate 3 Billion tuples down to 100 Million tuples based on an integer key with six integer

Re: [HACKERS] PostgreSQL win32 NT4

2006-01-16 Thread Bruce Momjian
[EMAIL PROTECTED] wrote: NT4 is officially dead, IMHO no need for PostgreSQL to officially support it, let's leave place for companies offering commercial postgresql versions to work on it if they have enough customer requests. BTW Win 2000 is more or less 6 years old now ... Agreed. If they

Re: [HACKERS] PostgreSQL win32 NT4

2006-01-16 Thread Joshua D. Drake
Bruce Momjian wrote: [EMAIL PROTECTED] wrote: NT4 is officially dead, IMHO no need for PostgreSQL to officially support it, let's leave place for companies offering commercial postgresql versions to work on it if they have enough customer requests. BTW Win 2000 is more or less 6 years old now

Re: [HACKERS] PostgreSQL win32 NT4

2006-01-16 Thread Magnus Hagander
NT4 is officially dead, IMHO no need for PostgreSQL to officially support it, let's leave place for companies offering commercial postgresql versions to work on it if they have enough customer requests. BTW Win 2000 is more or less 6 years old now ... I believe Microsoft has an

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: For HJ we write each outer tuple to its own file-per-batch in the order they arrive. Reading them back in preserves the original ordering. So yes, caution required, but I see no difficulty, just reworking the HJ code (nodeHashjoin and nodeHash). What else
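The ordering claim in the snippet above can be sketched as follows. This is an illustration only, not the nodeHashjoin code: in-memory lists stand in for the per-batch temp files, and batch assignment by key modulo the batch count is an assumption made for the example. The names spill_to_batches and outer are hypothetical.

```python
from collections import defaultdict

def spill_to_batches(tuples, nbatch):
    """Append each tuple to its batch in arrival order.
    Each list stands in for one temp file per batch."""
    batches = defaultdict(list)
    for t in tuples:
        batch_no = t[0] % nbatch  # assumed batch rule: integer key modulo nbatch
        batches[batch_no].append(t)
    return batches

# Hypothetical outer tuples as (key, payload), in the order they arrive.
outer = [(3, "a"), (1, "b"), (3, "c"), (2, "d"), (1, "e")]
batches = spill_to_batches(outer, nbatch=2)
print(batches[0], batches[1])
```

Since each batch is only ever appended to, reading a batch back yields its tuples in the same relative order in which they arrived, which is the property the message relies on.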

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-16 Thread Simon Riggs
On Mon, 2006-01-16 at 12:26 -0600, Jim C. Nasby wrote: On Fri, Jan 13, 2006 at 11:37:38PM -0500, Tom Lane wrote: Josh Berkus josh@agliodbs.com writes: It's also worth mentioning that for datatypes that only have an = operator the performance of compute_minimal_stats is O(N^2) when values

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-16 Thread Manfred Koizar
On Fri, 13 Jan 2006 19:18:29 +, Simon Riggs [EMAIL PROTECTED] wrote: I enclose a patch for checking out block sampling. Can't comment on the merits of block sampling and your implementation thereof. Just some nitpicking: |! * Row Sampling: As of May 2004, we use the Vitter algorithm to

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-16 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: Tom has not spoken against checking for UNIQUE constraints: he is just pointing out that there never could be a constraint in the case I was identifying. More generally, arguing for or against any system-wide change on the basis of performance of

[HACKERS] Anyone see a need for BTItem/HashItem?

2006-01-16 Thread Tom Lane
I'm considering getting rid of the BTItem/BTItemData and HashItem/HashItemData struct definitions and just referencing IndexTuple(Data) directly in the btree and hash AMs. It appears that at one time in the forgotten past, there was some access-method-specific data in index entries in addition to

Re: [HACKERS] Anyone see a need for BTItem/HashItem?

2006-01-16 Thread David Fetter
On Mon, Jan 16, 2006 at 03:52:01PM -0500, Tom Lane wrote: I'm considering getting rid of the BTItem/BTItemData and HashItem/HashItemData struct definitions and just referencing IndexTuple(Data) directly in the btree and hash AMs. It appears that at one time in the forgotten past, there was

Re: [HACKERS] Anyone see a need for BTItem/HashItem?

2006-01-16 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes: On Mon, Jan 16, 2006 at 03:52:01PM -0500, Tom Lane wrote: Does anyone see a reason to keep this layer of struct definitions? If you cut it out, what will the heap and index access methods needed for SQL/MED use? What's that have to do with this?

Re: [HACKERS] Anyone see a need for BTItem/HashItem?

2006-01-16 Thread Jonah H. Harris
From what I've seen, I don't think we need to keep them around. On 1/16/06, Tom Lane [EMAIL PROTECTED] wrote: I'm considering getting rid of the BTItem/BTItemData and HashItem/HashItemData struct definitions and just referencing IndexTuple(Data) directly in the btree and hash AMs. It appears that at

Re: [HACKERS] Anyone see a need for BTItem/HashItem?

2006-01-16 Thread David Fetter
On Mon, Jan 16, 2006 at 04:02:07PM -0500, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: On Mon, Jan 16, 2006 at 03:52:01PM -0500, Tom Lane wrote: Does anyone see a reason to keep this layer of struct definitions? If you cut it out, what will the heap and index access methods

[HACKERS] equivalence class not working?

2006-01-16 Thread uwcssa
not sure if this is the right place to post... I am using postgres 8.1. In indxpath.c, it says Note: if Postgres tried to optimize queries by forming equivalence classes over equi-joined attributes (i.e., if it recognized that a qualification such as where a.b=c.d and a.b=5 could make use of an

Re: [HACKERS] Anyone see a need for BTItem/HashItem?

2006-01-16 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes: On Mon, Jan 16, 2006 at 04:02:07PM -0500, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: If you cut it out, what will the heap and index access methods needed for SQL/MED use? What's that have to do with this? I'm sure you'll correct me if

Re: [HACKERS] equivalence class not working?

2006-01-16 Thread Tom Lane
uwcssa [EMAIL PROTECTED] writes: I am using postgres 8.1. In indxpath.c, it says Note: if Postgres tried to optimize queries by forming equivalence classes over equi-joined attributes (i.e., if it recognized that a qualification such as where a.b=c.d and a.b=5 could make use of an

Re: [HACKERS] ScanKey representation for RowCompare index conditions

2006-01-16 Thread Martijn van Oosterhout
On Mon, Jan 16, 2006 at 12:07:44PM -0500, Tom Lane wrote: Since you didn't understand what I was saying, I suspect that plan A is too confusing ... Umm, yeah. Now you've explained it I think it should be excluded on the basis that it'll be a source of bugs. For all the places that matter a

Re: [HACKERS] Anyone see a need for BTItem/HashItem?

2006-01-16 Thread David Fetter
On Mon, Jan 16, 2006 at 04:21:50PM -0500, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: On Mon, Jan 16, 2006 at 04:02:07PM -0500, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: If you cut it out, what will the heap and index access methods needed for SQL/MED use?

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Michael Glaesemann
On Jan 17, 2006, at 3:51 , Tom Lane wrote: A quick look through the doxygen manual doesn't make it sound too invasive, but I am worried about how well it will coexist with pgindent. It seems both tools think they can dictate the meaning of the characters immediately after /* of a comment

Re: [HACKERS] ScanKey representation for RowCompare index conditions

2006-01-16 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: On Mon, Jan 16, 2006 at 12:07:44PM -0500, Tom Lane wrote: Since you didn't understand what I was saying, I suspect that plan A is too confusing ... Umm, yeah. Now you've explained it I think it should be excluded on the basis that it'll be a

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Joachim Wieland
On Tue, Jan 17, 2006 at 06:51:15AM +0900, Michael Glaesemann wrote: I haven't looked at it yet, but might there not be a way to have a preprocessing step where the current comment format is converted to something doxygen-friendly? pg_indent2doxygen or something? Then the current comment style

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Bruce Momjian
Wow, looks great. Is that URL stable? Can we link to it from the PostgreSQL developers page? http://www.postgresql.org/developer/coding --- Joachim Wieland wrote: I've created a browsable source tree

Re: [pgsql-www] [HACKERS] source documentation tool doxygen

2006-01-16 Thread Marc G. Fournier
the only question I have ... is there any way of improving that right index? Love the 'detail pages', but somehow making the right index more 'tree like' might make navigation a bit easier ... On Mon, 16 Jan 2006, Bruce Momjian wrote: Wow, looks great. Is that URL stable? Can we link

Re: [HACKERS] equivalence class not working?

2006-01-16 Thread uwcssa
Fine. The rest of the documentation says: For now, the test only uses restriction clauses (those in restrictinfo_list). --Nels, Dec '92, however, I understand it as being overridden by the followup, which is: XXX as of 7.1, equivalence class info *is* available. Consider improving this code as foreseen by

Re: [pgsql-www] [HACKERS] source documentation tool doxygen

2006-01-16 Thread Michael Glaesemann
On Jan 17, 2006, at 8:40 , Marc G. Fournier wrote: the only question I have ... is there any way of improving that right index? Love the 'detail pages', but somehow making the right index more 'tree like' might make navigation a bit easier ... Along those lines, I wonder if a CSS

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-16 Thread Simon Riggs
On Mon, 2006-01-16 at 21:24 +0100, Manfred Koizar wrote: On Fri, 13 Jan 2006 19:18:29 +, Simon Riggs [EMAIL PROTECTED] wrote: I enclose a patch for checking out block sampling. Can't comment on the merits of block sampling and your implementation thereof. But your thoughts are

Re: [pgsql-www] [HACKERS] source documentation tool doxygen

2006-01-16 Thread Robert Treat
This was my plan all along, just been waiting for someone to make it work with the postgresql code and then send instructions to the postgresql web team on how to set it up. Robert Treat On Monday 16 January 2006 18:32, Bruce Momjian wrote: Wow, looks great. Is that URL stable? Can we link

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Simon Riggs
On Mon, 2006-01-16 at 14:43 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: For HJ we write each outer tuple to its own file-per-batch in the order they arrive. Reading them back in preserves the original ordering. So yes, caution required, but I see no difficulty, just

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: Sure hash table is dynamic, but we read all inner rows to create the hash table (nodeHash) before we get the outer rows (nodeHJ). But our idea of the number of batches needed can change during that process, resulting in some inner tuples being initially

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Jonathan Gardner
On Monday 16 January 2006 10:51, Tom Lane wrote: Neil Conway [EMAIL PROTECTED] writes: I don't think it would be all that painful. There would be no need to convert the entire source tree to use proper Doxygen-style comments in one fell swoop: individual files and modules can be converted

[HACKERS] FW: PGBuildfarm member firefly Branch HEAD Failed at Stage Check

2006-01-16 Thread Larry Rosenman
PG Build Farm wrote: The PGBuildfarm member firefly had the following event on branch HEAD: Failed at Stage: Check The snapshot timestamp for the build that triggered this notification is: 2006-01-17 03:27:00 The specs of this machine are: OS: UnixWare / 7.1.4 Arch: i386 Comp: cc

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: Why would we continue to dynamically build the hash table after the start of the outer scan? The number of tuples written to a temp file might exceed what we want to hold in memory; we won't detect this until the batch is read back in, and in that case

Re: [HACKERS] source documentation tool doxygen

2006-01-16 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes: Wow, looks great. Is that URL stable? Can we link to it from the PostgreSQL developers page? The thing seems to have only the very vaguest grasp on whether it is parsing C or C++ ... or should I say that it is convinced it is parsing C++ despite

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes: For a hash aggregate would it be possible to rescan the original table instead of spilling to temporary files? Sure, but the possible performance gain is finite and the possible performance loss is not. The original table could be an extremely expensive

Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-16 Thread Simon Riggs
On Mon, 2006-01-16 at 20:02 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Sure hash table is dynamic, but we read all inner rows to create the hash table (nodeHash) before we get the outer rows (nodeHJ). But our idea of the number of batches needed can change during that

Re: [HACKERS] [GENERAL] [PATCH] Better way to check for getaddrinfo function.

2006-01-16 Thread R, Rajesh (STSD)
That was very much situation specific. But the bottom line is the default test does not include netdb.h in the test code. So, pg uses getaddrinfo.c. And the getaddrinfo.c does not work for me. IPv6 client