Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Martijn van Oosterhout
On Thu, Oct 16, 2008 at 09:17:03PM -0600, Joshua Tolley wrote: Because I'm trying to picture geometrically how this might work for the two-column case, and hoping to extend that to more dimensions, and am finding that picturing a quantile-based system like the one we have now in multiple

Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Martijn van Oosterhout
On Fri, Oct 17, 2008 at 12:20:58AM +0200, Greg Stark wrote: Correlation is the wrong tool. In fact zip codes and city have nearly zero correlation. Zip codes near 0 are no more likely to be in cities starting with A than Z. I think we need to define our terms better. In terms of

Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Gregory Stark
Martijn van Oosterhout [EMAIL PROTECTED] writes: On Fri, Oct 17, 2008 at 12:20:58AM +0200, Greg Stark wrote: Correlation is the wrong tool. In fact zip codes and city have nearly zero correlation. Zip codes near 0 are no more likely to be in cities starting with A than Z. I think

[HACKERS] WIP: grouping sets support

2008-10-17 Thread Pavel Stehule
Hello, there is basic support of GROUPING SETS. This patch isn't ready for production! It's only material for discus. Parser and executor are done. Planner part should be completely rewritten. I don't understand well to planner, and I haven't idea that parts should be shared with groupby.

Re: [HACKERS] contrib/pg_stat_statements

2008-10-17 Thread Vladimir Sitnikov
Decibel! [EMAIL PROTECTED] wrote: I had tried to use a normal table for store stats information, but several acrobatic hacks are needed to keep performance. I guess it is not really required to synchronize the stats into some physical table immediately. I would suggest keeping all the data

Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Richard Huxton
Gregory Stark wrote: They're certainly very much not independent variables. There are lots of ways of measuring how much dependence there is between them. I don't know enough about the math to know if your maps are equivalent to any of them. I think dependency captures the way I think about it

Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Tom Lane
Martijn van Oosterhout [EMAIL PROTECTED] writes: Just a note: using a multidimensional histograms will work well for the cases like (startdate,enddate) where the histogram will show a clustering of values along the diagonal. But it will fail for the case (zipcode,state) where one implies the

Re: [HACKERS] 8.3 .4 + Vista + MingW + initdb = ACCESS_DENIED

2008-10-17 Thread Andrew Chernow
Andrew Dunstan wrote: Has anyone considered not using a file lock on windows? CreateMutex might do the trick if provided a mutex name, making it global rather than process bound. OpenMutex can be used to test if the mutex exists or if it is currently locked. I guess it would stay locked.

[HACKERS] Hot Standby: First integrated patch

2008-10-17 Thread Simon Riggs
First integrated patch for Hot Standby, allowing queries to be executed while in recovery mode. The patch tests successfully with the enclosed files: * primary_setup_test.sql - run it on primary node * standby_allowed.sql - run on standby - should all succeed * standby_disallowed.sql - run on

Re: [HACKERS] 8.3 .4 + Vista + MingW + initdb = ACCESS_DENIED

2008-10-17 Thread Andrew Dunstan
Andrew Chernow wrote: Andrew Dunstan wrote: Has anyone considered not using a file lock on windows? CreateMutex might do the trick if provided a mutex name, making it global rather than process bound. OpenMutex can be used to test if the mutex exists or if it is currently locked. I

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Alvaro Herrera
So this discussion died with no solution arising to the hint-bit-setting-invalidates-the-CRC problem. Apparently the only solution in sight is to WAL-log hint bits. Simon opines it would be horrible from a performance standpoint to WAL-log every hint bit set, and I think we all agree with that.

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Jonah H. Harris
On Fri, Oct 17, 2008 at 11:26 AM, Alvaro Herrera [EMAIL PROTECTED] wrote: So this discussion died with no solution arising to the hint-bit-setting-invalidates-the-CRC problem. I've been busy. Apparently the only solution in sight is to WAL-log hint bits. Simon opines it would be horrible

Re: [HACKERS] 8.3 .4 + Vista + MingW + initdb = ACCESS_DENIED

2008-10-17 Thread Andrew Chernow
Andrew Dunstan wrote: Andrew Chernow wrote: Andrew Dunstan wrote: Has anyone considered not using a file lock on windows? CreateMutex might do the trick if provided a mutex name, making it global rather than process bound. OpenMutex can be used to test if the mutex exists or if it is

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Greg Stark
I'm far from convinced wal logging hint bits is a non starter. In fact I doubt the wal log itself I a problem. Having to take the buffer lock does suck though. Heikki had a clever idea earlier which was to have two crc checks- one which skips the hint bits and one dedicated to hint bits.

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Jonah H. Harris
On Fri, Oct 17, 2008 at 12:05 PM, Greg Stark [EMAIL PROTECTED] wrote: Heikki had a clever idea earlier which was to have two crc checks- one which skips the hint bits and one dedicated to hint bits. If the second doesn't match we clear all the hint bits. Sounds overcomplicated to me. The

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Aidan Van Dyk
* Greg Stark [EMAIL PROTECTED] [081017 12:05]: I'm far from convinced wal logging hint bits is a non starter. In fact I doubt the wal log itself I a problem. Having to take the buffer lock does suck though. And remember, you don't even need to WAL all the hint-bit setts:... You only *need*

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Simon Riggs
On Fri, 2008-10-17 at 12:26 -0300, Alvaro Herrera wrote: Apparently the only solution in sight is to WAL-log hint bits. Simon opines it would be horrible from a performance standpoint to WAL-log every hint bit set, and I think we all agree with that. So we need to find an alternative

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Alvaro Herrera
Simon Riggs wrote: On Fri, 2008-10-17 at 12:26 -0300, Alvaro Herrera wrote: Apparently the only solution in sight is to WAL-log hint bits. Simon opines it would be horrible from a performance standpoint to WAL-log every hint bit set, and I think we all agree with that. So we need to

Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Ron Mayer
Tom Lane wrote: A bad estimate for physical-position correlation has only limited impact, Ah! This seems very true with 8.3 but much less true with 8.0. On a legacy 8.0 system I have a hard time avoiding cases where a query like select * from addresses where add_state_or_province = 'CA';

Re: [HACKERS] Improving planner variable handling

2008-10-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane tgl(at)sss(dot)pgh(dot)pa(dot)us writes: * When we have a non-nullable expression in a sub-select's targetlist, and it's below an outer join, replace the expression by CASE WHEN flag_var THEN original_expression ELSE NULL END and then

Re: [HACKERS] small bug in hlCover

2008-10-17 Thread Teodor Sigaev
Thank you, patch applied. Sushant Sinha wrote: Has any one noticed this? -Sushant. On Wed, 2008-07-16 at 23:01 -0400, Sushant Sinha wrote: I think there is a slight bug in hlCover function in wparser_def.c If there is only one query item and that is the first word in the text, then hlCover

[HACKERS] Incorrect cursor behaviour with gist index

2008-10-17 Thread Teodor Sigaev
I'm back, sorry for a long absence. About this bug: http://archives.postgresql.org/pgsql-bugs/2008-09/msg00086.php Unfortunately, GiST index doesn't work with change direction of scan. I.e. it can't move forward then move backward and this behaviour was from the beginning. I think it's

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Simon Riggs
On Fri, 2008-10-17 at 13:59 -0300, Alvaro Herrera wrote: It might be possible to have a partial solution where some blocks have CRC checks, some not. That's another idea but it reduces the effectiveness of the check. If you put in a GUC to control the check, block by block. 0 = check

Re: [HACKERS] Incorrect cursor behaviour with gist index

2008-10-17 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes: I'm back, sorry for a long absence. About this bug: http://archives.postgresql.org/pgsql-bugs/2008-09/msg00086.php Unfortunately, GiST index doesn't work with change direction of scan. I.e. it can't move forward then move backward and this behaviour

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Greg Stark
I don't think that works anyways. No matter how thoroughly you update all the hint bits there's still a chance someone else comes along and sets one you missed or is setting hint bits on the same tuple at the same time and your update gets lost. greg On 17 Oct 2008, at 06:59 PM, Alvaro

Re: [HACKERS] Incorrect cursor behaviour with gist index

2008-10-17 Thread Teodor Sigaev
Seems like a lotta work for a partial solution :-(. Probably the path of least resistance is to teach the planner that only some index AMs can do backwards scan. That would result in a Materialize buffer getting placed in front of the query if the user demanded scroll capability, but it would

Re: [HACKERS] Hot Standby: First integrated patch

2008-10-17 Thread Merlin Moncure
On Fri, Oct 17, 2008 at 10:38 AM, Simon Riggs [EMAIL PROTECTED] wrote: First integrated patch for Hot Standby, allowing queries to be executed while in recovery mode. The patch tests successfully with the enclosed files: * primary_setup_test.sql - run it on primary node *

Re: [HACKERS] Incorrect cursor behaviour with gist index

2008-10-17 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes: Seems like a lotta work for a partial solution :-(. Probably the path of least resistance is to teach the planner that only some index AMs can do backwards scan. That would result in a Materialize buffer getting placed in front of the query if the user

Re: [HACKERS] Incorrect cursor behaviour with gist index

2008-10-17 Thread Teodor Sigaev
to use it when the AM can't guarantee to return the same sequence of tuples after backing up. So I think it would be sufficient to have gistmarkpos et al throw error if called. Why not to remove gistrestrpos/gistmarkpos/ginrestrpos/ginmarkpos from pg_am table? -- Teodor Sigaev

Re: [HACKERS] Incorrect cursor behaviour with gist index

2008-10-17 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes: to use it when the AM can't guarantee to return the same sequence of tuples after backing up. So I think it would be sufficient to have gistmarkpos et al throw error if called. Why not to remove gistrestrpos/gistmarkpos/ginrestrpos/ginmarkpos from

Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Nathan Boley
Right now our histogram values are really quantiles; the statistics_target T for a column determines a number of quantiles we'll keep track of, and we grab values from into an ordered list L so that approximately 1/T of the entries in that column fall between values L[n] and L[n+1]. I'm

Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Joshua Tolley
On Fri, Oct 17, 2008 at 3:47 PM, Nathan Boley [EMAIL PROTECTED] wrote: Right now our histogram values are really quantiles; the statistics_target T for a column determines a number of quantiles we'll keep track of, and we grab values from into an ordered list L so that approximately 1/T of

Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Nathan Boley
I'm still working my way around the math, but copulas sound better than anything else I've been playing with. I think the easiest way to think of them is, in 2-D finite spaces, they are just a plot of the order statistics against one another. Feel free to mail me off list if you have any math