Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-04 Thread Greg Stark

Greg Stark [EMAIL PROTECTED] writes:

 Perhaps what this indicates is that the real meat is in track sampling, not
 block sampling.

Fwiw, I've done a little benchmarking and I'm starting to think this isn't a
bad idea. I see a dramatic speed improvement for samples of 1-10% as the block
size increases. Presumably this is as Hannu said, reducing the number of
tracks necessary to cover the sample.

I see improvements up to around 256M blocks or so, but my data is pretty
questionable since I'm busy watching tv in Mythtv in another window. It's on
another drive but it still seems to be making the numbers jump around a bit.

I expect there's a trade-off between keeping enough blocks for the sample of
blocks to be representative on the one hand and large blocks being much faster
to read in on the other.

I would suggest something like setting the block size in the block sampling
algorithm to something like max(8k,sqrt(table size)). That gives 8k blocks for
anything up to 255M but takes better advantage of the speed increase available
from sequential i/o for larger tables, from my experiments about a 50%
increase in speed. 

Actually maybe even something even more aggressive would be better, maybe
(table size)^.75 So it kicks in sooner than on 256M tables and gets to larger
block sizes on reasonable sized tables.

Note, this doesn't mean anything like changing page sizes, just selecting more
blocks that hopefully lie on the same track when possible.

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-04 Thread Greg Stark
Greg Stark [EMAIL PROTECTED] writes:

 I see improvements up to around 256M blocks or so, but my data is pretty
 questionable since I'm busy watching tv in Mythtv in another window. It's on
 another drive but it still seems to be making the numbers jump around a bit.

Oops, I meant blocks of 256k there. Sorry.

-- 
greg


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Possible TODO item: copy to/from pipe

2006-06-04 Thread Andreas Pflug

Bruce Momjian wrote:


For use case, consider this:

COPY mytable TO '| rsh [EMAIL PROTECTED]  test ';

so you can COPY to another server directly.
  

Why not rsh psql -c \copy foobar to test ?

Regards,
Andreas


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-04 Thread Andrew Dunstan
I said:


 Another option would be to re-run cvs up one more time if we get any
 unexpected files. It sounds like that would fix this issue on windows
 machines, while still ensuring we had a clean repo to work from.


 please see the new release of the buildfarm client, in which I have
 followed Tom's suggestion of removing the -P flag from the checkout and
 update commands - that should solve the Windows problem, as it will no
 longer try to remove the directory. I hope that solves the problem -
 if  not I'll have a look at other solutions.


Unfortunately, this fell over first time out:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=lorisdt=2006-06-04%2012:09:33
The fix handled directories, but we got a false positive from a rename not
being immediate either, it seems. Bloody Windows!

One thought I had was to force Windows to use CVS export rather than update.
This has 2 disadvantages: it requires a complete repo fetch every run, even
if we don't need to do anything because nothing has changed, and it also
means we can't report the version numbers on files changed. Example:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=lorisdt=2006-06-04%2012:21:43
So what I'm going to try instead is a variation on Jim's suggestion above,
but instead of re-running cvs update, what we'll do is a longish sleep (say
10 or 20 secs) which should be enough time for Windows to get its act
together, and then run cvs status, which will also show us extraneous files.

thoughts?

cheers

andrew



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-04 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Another option would be to re-run cvs up one more time if we get any
 unexpected files. It sounds like that would fix this issue on windows
 machines, while still ensuring we had a clean repo to work from.

 So what I'm going to try instead is a variation on Jim's suggestion above,
 but instead of re-running cvs update, what we'll do is a longish sleep (say
 10 or 20 secs) which should be enough time for Windows to get its act
 together, and then run cvs status, which will also show us extraneous files.

Yeah, this is probably OK since you only need to do it if you see any ?
entries in the cvs update.  Another low-tech solution is to sleep a bit
and then see if any of the files/directories listed in ? entries are
still there.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-04 Thread Magnus Hagander
 Unfortunately, this fell over first time out:
 http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=lorisdt=200
6-06-04%2012:09:33
 The fix handled directories, but we got a false positive from 
 a rename not being immediate either, it seems. Bloody Windows!

Are you running this from msys or from actual windows? I haven't
observed this outside msys, in which case it might be an idea to execute
it elsewhere, but I don't think I've done things like it enough to be
sure that makes a difference..


//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-04 Thread Greg Stark

Hannu Krosing [EMAIL PROTECTED] writes:

 Ühel kenal päeval, L, 2006-06-03 kell 10:43, kirjutas Jim Nasby:
 
  Might also be worth adding analyze delay settings, ala  
  vacuum_cost_delay.
 
 Actually we should have delay settings for all potential
 (almost-)full-scan service ops, - VACUUM, ANALYSE, CREATE INDEX, ADD
 CONSTRAINT, maybe more - so that there would be better chances of
 running those on busy databases without disastrous effects.

What about UPDATE and DELETE and for that matter SELECT?


-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Possible TODO item: copy to/from pipe

2006-06-04 Thread Greg Stark
Andreas Pflug [EMAIL PROTECTED] writes:

 Bruce Momjian wrote:
 
  For use case, consider this:
 
  COPY mytable TO '| rsh [EMAIL PROTECTED]  test ';
 
  so you can COPY to another server directly.
 
 Why not rsh psql -c \copy foobar to test ?

Who knows? The sysadmin could have any reason to prefer one to the other. 

One that comes to mind is that he or she may want to automate this and may be
happier granting password-free access to a filesystem server that just holds
encrypted backups than to the live production database.

-- 
greg


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-04 Thread Greg Stark
Andrew Dunstan [EMAIL PROTECTED] writes:

 One thought I had was to force Windows to use CVS export rather than update.
 This has 2 disadvantages: it requires a complete repo fetch every run, even
 if we don't need to do anything because nothing has changed, and it also
 means we can't report the version numbers on files changed. 

You could also just have the windows machine rsync the directory from one of
the other build machines?


-- 
greg


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-04 Thread Andrew Dunstan



Greg Stark wrote:


Andrew Dunstan [EMAIL PROTECTED] writes:

 


One thought I had was to force Windows to use CVS export rather than update.
This has 2 disadvantages: it requires a complete repo fetch every run, even
if we don't need to do anything because nothing has changed, and it also
means we can't report the version numbers on files changed. 
   



You could also just have the windows machine rsync the directory from one of
the other build machines?

 




The farm is distributed - none of the members have any knowledge of the 
others. And it is a design requirement that no inbound access is 
required for buildfarm members, and that no tools are required other 
than  those that are required to build postgres.


Anyway, I think we have it covered now,.

cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-04 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Hannu Krosing [EMAIL PROTECTED] writes:
 Ühel kenal päeval, L, 2006-06-03 kell 10:43, kirjutas Jim Nasby:
 Might also be worth adding analyze delay settings, ala  
 vacuum_cost_delay.

ANALYZE already respects the vacuum delay settings.

 Actually we should have delay settings for all potential
 (almost-)full-scan service ops, - VACUUM, ANALYSE, CREATE INDEX, ADD
 CONSTRAINT, maybe more - so that there would be better chances of
 running those on busy databases without disastrous effects.

 What about UPDATE and DELETE and for that matter SELECT?

This seems pretty silly.  The point of the delay stuff is to prevent
background maintenance operations from eating an unreasonable share
of resources compared to foreground queries.  I don't see why you'd
put delays into queries --- if your machine is loaded, it's loaded.

I think the existing features are sufficient in this line and that
doing more is just adding complexity for complexity's sake.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings