Re: [HACKERS] Going for "all green" buildfarm results

2006-06-01 Thread Stefan Kaltenbrunner
Tom Lane wrote: > I've been making another pass over getting rid of buildfarm failures. > The remaining ones I see at the moment are: > > firefly HEAD: intermittent failures in the stats test. We seem to have > fixed every other platform back in January, but not this one. > > kudu HEAD: one-time

[HACKERS] Going for "all green" buildfarm results

2006-06-01 Thread Tom Lane
I've been making another pass over getting rid of buildfarm failures. The remaining ones I see at the moment are: firefly HEAD: intermittent failures in the stats test. We seem to have fixed every other platform back in January, but not this one. kudu HEAD: one-time failure 6/1/06 in statement_t

Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-01 Thread Tom Lane
"Andrew Dunstan" <[EMAIL PROTECTED]> writes: > Tom Lane said: >> meerkat and snake both have persistent "CVS-Unknown" failures in some >> but not all branches. I can't see any evidence of an actual failure in >> their logs though. > cvs-unknown means there are unknown files in the repo: Oh. Wel

Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-01 Thread Andrew Dunstan
Joshua D. Drake said: > Tom Lane wrote: >> >> A more radical answer is to have the script go ahead and delete the >> offending files itself, but I can see where that might not have good >> fail-soft behavior ... > > I have manually run a dist-clean on meerkat for 8_0 and 8_1 and am > rerunning the

Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-01 Thread Andrew Dunstan
Tom Lane said: > meerkat and snake both have persistent "CVS-Unknown" failures in some > but not all branches. I can't see any evidence of an actual failure in > their logs though. What I do see is "?" entries about files that > shouldn't be there --- for instance, meerkat apparently needs a "mak

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Tom Lane
Mark Kirkwood <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> With this model, the disk cost to fetch a single >> index entry will be estimated as random_page_cost (default 4.0) rather >> than the current fixed 2.0. This shouldn't hurt things too much for >> simple indexscans --- especially since

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Mark Kirkwood
Tom Lane wrote: Another thing that's bothering me is that the index access cost computation (in genericcostestimate) is looking sillier and sillier: /* * Estimate the number of index pages that will be retrieved. * * For all currently-supported index types, the first page of

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread David Fetter
On Thu, Jun 01, 2006 at 08:36:16PM -0400, Greg Stark wrote: > > Josh Berkus writes: > > > Greg, Tom, > > > > > a) We already use block based sampling to reduce overhead. If > > > you're talking about using the entire block and not just > > > randomly sampled tuples from within those blocks the

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Greg Stark
Josh Berkus writes: > Greg, Tom, > > > a) We already use block based sampling to reduce overhead. If you're > > talking about using the entire block and not just randomly sampled > > tuples from within those blocks then your sample will be biased. > > There are actually some really good equati

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > Speaking of plan instability, something that's badly needed is the > ability to steer away from query plans that *might* be the most optimal, > but also will fail horribly should the cost estimates be wrong. You sure that doesn't leave us with the empty

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Jim C. Nasby
On Thu, Jun 01, 2006 at 03:15:09PM -0400, Tom Lane wrote: > These would all be nice things to know, but I'm afraid it's pie in the > sky. We have no reasonable way to get those numbers. (And if we could > get them, there would be another set of problems, namely plan instability: > the planner's c

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Jim C. Nasby
On Thu, Jun 01, 2006 at 02:25:56PM -0400, Greg Stark wrote: > > Josh Berkus writes: > > > 1. n-distinct estimation is bad, as previously discussed; > > > > 2. our current heuristics sampling methods prevent us from sampling more > > than > > 0.5% of any reasonably large table, causing all stat

Re: [HACKERS] "CVS-Unknown" buildfarm failures?

2006-06-01 Thread Joshua D. Drake
Tom Lane wrote: meerkat and snake both have persistent "CVS-Unknown" failures in some but not all branches. I can't see any evidence of an actual failure in their logs though. What I do see is "?" entries about files that shouldn't be there --- for instance, meerkat apparently needs a "make dis

[HACKERS] "CVS-Unknown" buildfarm failures?

2006-06-01 Thread Tom Lane
meerkat and snake both have persistent "CVS-Unknown" failures in some but not all branches. I can't see any evidence of an actual failure in their logs though. What I do see is "?" entries about files that shouldn't be there --- for instance, meerkat apparently needs a "make distclean". If that'

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Josh Berkus
Greg, > > 1) You have n^2 possible two-column combinations. That's a lot of > > processing and storage. > > Yes, that's the hard problem to solve.  Actually, btw, it's n!, not n^2. Ooops, bad math. Andrew pointed out it's actually n*(n-1)/2, not n!. Also, we could omit columns unlikely to corre
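A quick check of the corrected count in the message above: for n columns, the number of unordered two-column combinations is n*(n-1)/2. The sketch below compares that formula with explicit enumeration; the function name `pair_count` and the sample value n = 10 are illustrative assumptions, not taken from the thread.

```python
from itertools import combinations
from math import factorial


def pair_count(n: int) -> int:
    """Number of unordered two-column combinations among n columns."""
    return n * (n - 1) // 2


n = 10  # hypothetical column count, chosen only for illustration

# Enumerate every unordered pair of column indexes and count them.
enumerated = len(list(combinations(range(n), 2)))

# Print the formula, the enumeration, and the two earlier guesses (n^2, n!).
print(pair_count(n), enumerated, n ** 2, factorial(n))
```

For n = 10 this gives 45 pairs, against 100 for n^2 and 3628800 for n!, which is why the correction matters for the storage argument.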

Re: [HACKERS] Generalized concept of modules

2006-06-01 Thread Tom Lane
Martijn van Oosterhout writes: > Well, in that case I'd like to give some concrete suggestions: > 1. The $libdir in future may be used to find SQL scripts as well as > shared libraries. They'll have different extensions so no possibility > of conflict. No, it needs to be a separate directory, an

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Josh Berkus
Greg, Tom, > a) We already use block based sampling to reduce overhead. If you're > talking about using the entire block and not just randomly sampled > tuples from within those blocks then your sample will be biased. There are actually some really good equations to work with this, estimating bo

Re: [HACKERS] Generalized concept of modules

2006-06-01 Thread Martijn van Oosterhout
On Wed, May 31, 2006 at 05:33:44PM -0400, Tom Lane wrote: > Martijn van Oosterhout writes: > > While you do have a good point about non-binary modules, our module > > handling needs some help IMHO. For example, the current hack for CREATE > > LANGUAGE to fix things caused by old pg_dumps. I think t

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Greg Stark
Josh Berkus writes: > > However it will only make sense if people are willing to accept that > > analyze will need a full table scan -- at least for tables where the DBA > > knows that good n_distinct estimates are necessary. > > What about block-based sampling? Sampling 1 in 20 disk pages, r

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Tom Lane
Josh Berkus writes: > Yeah. I've refrained from proposing changes because it's a > pick-up-sticks. If we start modifying the model, we need to fix > *everything*, not just one item. And then educate our users that they > need to use the GUC variables in a different way. Here's the issues I

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Josh Berkus
Greg, > I'm convinced these two are more connected than you believe. Actually, I think they are inseparable. > I might be interested in implementing that algorithm that was posted a > while back that involved generating good unbiased samples of discrete > values. The algorithm was quite clever a

[HACKERS] stable snapshot looks outdated

2006-06-01 Thread Robert Treat
Looking at http://www.postgresql.org/ftp/stable_snapshot/ surely we have achieved stability at least once since 2005-11-26.. :-) Can we get that fixed? -- Robert Treat Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Greg Stark
Josh Berkus writes: > 1. n-distinct estimation is bad, as previously discussed; > > 2. our current heuristics sampling methods prevent us from sampling more than > 0.5% of any reasonably large table, causing all statistics on those tables to > be off for any table with irregular distribution of

Re: [HACKERS] CTID issues and a soc student in need of help

2006-06-01 Thread Tzahi Fadida
On Thu, 2006-06-01 at 12:45 -0400, Tom Lane wrote: > Tzahi Fadida <[EMAIL PROTECTED]> writes: > > I am not sure about the definition of a context of a single SQL command. > > Well, AFAICS selecting a disjunction ought to qualify as a single SQL > command using a single snapshot. It's not that dif

Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-01 Thread Josh Berkus
Tom, As you know, this is something I think about a bit too, though not nearly as deeply as you. In general it seems to me that for CPU-bound databases, the default values of the cpu_xxx_cost variables are too low. I am tempted to raise the default value of cpu_index_tuple_cost to 0.005, whi

Re: [HACKERS] CTID issues and a soc student in need of help

2006-06-01 Thread Tom Lane
Tzahi Fadida <[EMAIL PROTECTED]> writes: > I am not sure about the definition of a context of a single SQL command. Well, AFAICS selecting a disjunction ought to qualify as a single SQL command using a single snapshot. It's not that different from a JOIN or UNION operation, no? > Inside C-langua

Re: [HACKERS] CTID issues and a soc student in need of help

2006-06-01 Thread Tzahi Fadida
I am not sure about the definition of a context of a single SQL command. Example of a run: A <- SELECT getfdr('Relation1,Relation2,Relation3'); to get the result schema (takes a few milliseconds). SELECT * FROM FullDisjunctions('Relation1,Relation2,Relation3') AS RECORD A; Can take a long time.

Re: [HACKERS] CTID issues and a soc student in need of help

2006-06-01 Thread Tom Lane
Tzahi Fadida <[EMAIL PROTECTED]> writes: > I am using CTID for the concept of a tuple set. > For example, the set of t1 from relation1, t1 from relation2, t10 from > relation3 will be represented in my function as a list > of (TableID:CTID) pairs. > For example {(1:1),(2:1),(3:10)} > I then save th

Re: [HACKERS] CTID issues and a soc student in need of help

2006-06-01 Thread Tzahi Fadida
I am using CTID for the concept of a tuple set. For example, the set of t1 from relation1, t1 from relation2, t10 from relation3 will be represented in my function as a list of (TableID:CTID) pairs. For example {(1:1),(2:1),(3:10)} I then save these in bytea arrays in a tuplestore. This is essentia

Re: [HACKERS] CTID issues and a soc student in need of help

2006-06-01 Thread Martijn van Oosterhout
On Thu, Jun 01, 2006 at 03:33:50PM +0300, Tzahi Fadida wrote: > The question is, can the CTID field change throughout > the run of my function due to some other processes working > on the relation? Or because of command boundaries it is > pretty much secured inside an implicit transaction? > The pr

[HACKERS] CTID issues and a soc student in need of help

2006-06-01 Thread Tzahi Fadida
Hi, I am a Google SoC student and in need of some help with PostgreSQL internals: My C function can run (and is already running) for a very long time on some inputs and reiterate over relations using SPI. Basically, I open portals and cursors to relations. Also note that I always open the relati

Re: [HACKERS] Possible TODO item: copy to/from pipe

2006-06-01 Thread Mark Woodward
> After re-reading what I just wrote to Andreas about how compression of > COPY data would be better done outside the backend than inside, it > struck me that we are missing a feature that's fairly common in Unix > programs. Perhaps COPY ought to have the ability to pipe its output > to a shell co

Re: [HACKERS] session id and global storage

2006-06-01 Thread Andrew Dunstan
Hannu Krosing said: > On Thu, 2006-06-01 at 10:10, David Hoksza wrote: >> It seems MyProcID is what I was searching for... >> > > On a busy server with lots of connects, procID will repeat quite often. > log_line_prefix has a sessionid gadget: Session ID: A unique identifier

Re: [HACKERS] session id and global storage

2006-06-01 Thread Hannu Krosing
On Thu, 2006-06-01 at 10:10, David Hoksza wrote: > It seems MyProcID is what I was searching for... > On a busy server with lots of connects, procID will repeat quite often. -- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn

Re: [HACKERS] copy with compression progress n

2006-06-01 Thread Hannu Krosing
On Wed, 2006-05-31 at 17:31, Andreas Pflug wrote: > Tom Lane wrote: > > Andreas Pflug <[EMAIL PROTECTED]> writes: > > > >>The attached patch implements COPY ... WITH [BINARY] COMPRESSION > >>(compression implies BINARY). The copy data uses bit 17 of the flag > >>field to ident

Re: [HACKERS] Possible TODO item: copy to/from pipe

2006-06-01 Thread Dawid Kuroczko
On 5/31/06, Tom Lane <[EMAIL PROTECTED]> wrote: After re-reading what I just wrote to Andreas about how compression of COPY data would be better done outside the backend than inside, it struck me that we are missing a feature that's fairly common in Unix programs. Perhaps COPY ought to have the

Re: [HACKERS] session id and global storage

2006-06-01 Thread David Hoksza
It seems MyProcID is what I was searching for... David Hoksza DH> Something like this would be maybe possible, but this select can DH> return more rows, when the user is connected with more instances... DH> David Hoksza DH> >>>