Re: [HACKERS] PITR Dead horse?
-----Original Message----- From: Nicolai Tufar [mailto:[EMAIL PROTECTED] Sent: 05 February 2004 00:01 To: [EMAIL PROTECTED] Subject: Re: [HACKERS] PITR Dead horse? Totally agree. Robustness and rock-solidness are the only things missing for PostgreSQL to become the killer of certain commercial enterprise databases out there. Well I've only been using PostgreSQL since 1997 and the *only* release I ever had problems with was 6.3.2. We also use(d) Informix SE, DB2, Unidata and SQL Server and only Informix and Unidata come close to the robustness of PostgreSQL - and they're not the ones we need to worry about. Now I'm not saying we shouldn't be continually looking to improve things, but I don't think this is quite the problem you imply. Regards, Dave. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] PITR Dead horse?
-----Original Message----- From: Nicolai Tufar [mailto:[EMAIL PROTECTED] Sent: 05 February 2004 08:15 To: Dave Page Subject: RE: [HACKERS] PITR Dead horse? -----Original Message----- From: Dave Page [mailto:[EMAIL PROTECTED] Well I've only been using PostgreSQL since 1997 and the *only* release I ever had problems with was 6.3.2. We also use(d) Informix SE, DB2, Unidata and SQL Server and only Informix and Unidata come close to the robustness of PostgreSQL - and they're not the ones we need to worry about. Don't know. But apparently different users will have different demands from a database. Of course, but I would argue that my claim that PostgreSQL is reliable is backed up by the lack of people posting messages like 'we had a powercut and now my DB is hosed'. Now I'm not saying we shouldn't be continually looking to improve things, but I don't think this is quite the problem you imply. For the customers I am dealing with it is quite a problem, believe me. Do they have specific problems with the reliability of PostgreSQL then? Perhaps you could post details of how things have gone wrong for them (assuming you haven't already - I don't recall anything on -hackers recently). Regards, Dave ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
I don't think the bgwriter is going to be able to keep up with I/O bound backends, but I do think it can scan and set those booleans fast enough for the backends to then perform the writes. As long as the bgwriter does not do sync writes (which it does not, since that would need a whole lot of work to be performant) it calls write which returns more or less at once. So the bottleneck can only be the fsync. From those you would want at least one per pg disk open in parallel. But I think it should really be left to the OS when it actually does the IO for the writes from the bgwriter in between checkpoints. So Imho the target should be to have not much IO open for the checkpoint, so the fsync is fast enough, even if serial. Andreas ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] log session end - again
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Right. And if we have sessionids we would want them logged there, I think. And that would rule out anything based on xid or backend pid. Uh, what's wrong with backend pid? Since we fork before we start doing anything with a connection, it should surely be available soon enough for the connection log message. Larry's idea about combining PID and backend start time didn't sound too unreasonable to me. Wow, like the idea too --- pid plus time_t start time of backend. Actually, it would be good to have start time first so you can sort everything in order of start time. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073
Re: [HACKERS] [PATCHES] log session end - again
Bruce Momjian wrote: Tom Lane wrote: Larry's idea about combining PID and backend start time didn't sound too unreasonable to me. Wow, like the idea too --- pid plus time_t start time of backend. Actually, it would be good to have start time first so you can sort everything in order of start time. Yes. Of course, you would need either to sort additionally on log timestamp or use an order-preserving sort. Based on Larry's idea, I had in mind to provide a third escape in the log_line_info string (in addition to the %U and %D that I had previously done) of %S for sessionid, which would look something like this: 402251fc.713f I will start redoing this feature when the log_disconnections patch is dealt with. cheers andrew
Re: [HACKERS] PITR Dead horse?
Dave Page wrote: -----Original Message----- From: Nicolai Tufar [mailto:[EMAIL PROTECTED] Sent: 05 February 2004 00:01 To: [EMAIL PROTECTED] Subject: Re: [HACKERS] PITR Dead horse? Totally agree. Robustness and rock-solidness are the only things missing for PostgreSQL to become the killer of certain commercial enterprise databases out there. Well I've only been using PostgreSQL since 1997 and the *only* release I ever had problems with was 6.3.2. We also use(d) Informix SE, DB2, Unidata and SQL Server and only Informix and Unidata come close to the robustness of PostgreSQL - and they're not the ones we need to worry about. Now I'm not saying we shouldn't be continually looking to improve things, but I don't think this is quite the problem you imply. I assume he was talking about the lack of data recovery in cases of hard drive failure --- we now require you restore from backup or use a replicated machine/drive setup. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes: So Imho the target should be to have not much IO open for the checkpoint, so the fsync is fast enough, even if serial. The best we can do is push out dirty pages with write() via the bgwriter and hope that the kernel will see fit to write them before checkpoint time arrives. I am not sure if that hope has basis in fact or if it's just wishful thinking. Most likely, if it does have basis in fact it's because there is a standard syncer daemon forcing a sync() every thirty seconds. That means that instead of an I/O storm every checkpoint interval, we get a smaller I/O storm every 30 seconds. Not sure this is a big improvement. Jan already found out that issuing very frequent sync()s isn't a win. People keep saying that the bgwriter mustn't write pages synchronously because it'd be bad for performance, but I think that analysis is faulty. Performance of what --- the bgwriter? Nonsense, the *point* of the bgwriter is to do the slow tasks. The only argument that has any merit is that O_SYNC or immediate fsync will prevent us from having multiple writes outstanding and thus reduce the efficiency of disk write scheduling. This is a valid point but there is a limit to how many writes we need to have in flight to keep things flowing smoothly. What I'm thinking now is that the bgwriter should issue frequent fsyncs for its writes --- not immediate, but a lot more often than once per checkpoint. Perhaps take one recently-written unsynced file to fsync every time it is about to sleep. You could imagine various rules for deciding which one to sync; perhaps the one with the most writes issued against it since last sync. When we have tablespaces it'd make sense to try to distribute the syncs across tablespaces, on the assumption that the tablespaces are probably on different drives. regards, tom lane
Re: [HACKERS] PITR Dead horse?
Tatsuo Ishii wrote: Has this been beaten to death now? Just curious if PITR was in Dev tree yet. Been out of the loop. TIA. My co-workers and I are very interested in implementing PITR. We will tackle this for 7.5 if no one objects. I have put up a PITR project page: http://momjian.postgresql.org/main/writings/pgsql/project
Re: [HACKERS] PITR Dead horse?
Koichi Suzuki wrote: Hi, This is Suzuki from NTT DATA Intellilink. I told Bruce Momjian that I and my colleagues are interested in implementing PITR in BOF in NY LW2004. NTT's laboratory is very interested in this issue and I'm planning to work with them. I hope we could cooperate. Yes, I am going to focus on this next week when I return. With Win32 moving along, PITR is my next big target. I want to get things moving. The first step is for Tom to get the PITR WAL patches in. Then we need to discuss what else we need and get those on the PITR project page: http://momjian.postgresql.org/main/writings/pgsql/project
Re: [HACKERS] PITR Dead horse?
Nicolai Tufar wrote: I would like to join this effort too. I was afraid that people at RedHat are already halfway through and were to release their work shortly. But it does not seem to be the case. We are a long way away from completion: http://momjian.postgresql.org/main/writings/pgsql/project ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] PITR Dead horse?
Tom Lane wrote: Marc G. Fournier [EMAIL PROTECTED] writes: Is this something large enough, like the win32 stuff, that having a side list for discussions is worth setting up? In terms of the amount of code to be written, I expect it's larger than the win32 porting effort. And it should be mostly pretty separate from hacking the core backend, since most of what remains to do is writing external management utilities (I think). I've been dissatisfied with having the separate pgsql-hackers-win32 list; I feel it just fragments the discussion, and people tend to end up crossposting to -hackers anyway. But a separate list for PITR work might be a good idea despite that experience, since it seems like it'd be a more separable project. I think the win32 email list has worked well. What it has allowed is people who want to track only win32 to get only those emails. It doesn't help people already on hackers because hacker input is needed. There are currently 102 Win32 subscribers, and most are not on the hackers list.
Re: [HACKERS] PITR Dead horse?
Marc G. Fournier wrote: On Wed, 4 Feb 2004, Tatsuo Ishii wrote: I and some other developers are also interested in. Do you think we can work together? Sure. Why not. I think it would be practical to decide who is the leader of this project, though. Is this something large enough, like the win32 stuff, that having a side list for discussions is worth setting up? Yes, I would like to have such a list, and will advertise it on the PITR project page: http://momjian.postgresql.org/main/writings/pgsql/project
Re: [HACKERS] PITR Dead horse?
Bruce Momjian wrote: Dave Page wrote: -----Original Message----- From: Nicolai Tufar [mailto:[EMAIL PROTECTED] Sent: 05 February 2004 00:01 To: [EMAIL PROTECTED] Subject: Re: [HACKERS] PITR Dead horse? Totally agree. Robustness and rock-solidness are the only things missing for PostgreSQL to become the killer of certain commercial enterprise databases out there. Well I've only been using PostgreSQL since 1997 and the *only* release I ever had problems with was 6.3.2. We also use(d) Informix SE, DB2, Unidata and SQL Server and only Informix and Unidata come close to the robustness of PostgreSQL - and they're not the ones we need to worry about. Now I'm not saying we shouldn't be continually looking to improve things, but I don't think this is quite the problem you imply. I assume he was talking about the lack of data recovery in cases of hard drive failure --- we now require you restore from backup or use a replicated machine/drive setup. I retract this email. He clearly was talking about PostgreSQL reliability, and Dave is right, it is pretty much a non-issue, though maybe mindshare needs some help. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
On Thursday 05 February 2004 20:24, Tom Lane wrote: Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes: So Imho the target should be to have not much IO open for the checkpoint, so the fsync is fast enough, even if serial. The best we can do is push out dirty pages with write() via the bgwriter and hope that the kernel will see fit to write them before checkpoint time arrives. I am not sure if that hope has basis in fact or if it's just wishful thinking. Most likely, if it does have basis in fact it's because there is a standard syncer daemon forcing a sync() every thirty seconds. There are other benefits of writing pages earlier even though they might not get synced immediately. It would tell kernel that this is latest copy of updated buffer. Kernel VFS should make that copy visible to every other backend as well. The buffer manager will fetch the updated copy from VFS cache next time. All without going to disk actually..(Within the 30 seconds window of course..) People keep saying that the bgwriter mustn't write pages synchronously because it'd be bad for performance, but I think that analysis is faulty. Performance of what --- the bgwriter? Nonsense, the *point* of the bgwriter is to do the slow tasks. The only argument that has any merit is that O_SYNC or immediate fsync will prevent us from having multiple writes outstanding and thus reduce the efficiency of disk write scheduling. This is a valid point but there is a limit to how many writes we need to have in flight to keep things flowing smoothly. Is it a valid assumption for platforms-that-postgresql-supports that a write call would make changes visible across processes? What I'm thinking now is that the bgwriter should issue frequent fsyncs for its writes --- not immediate, but a lot more often than once per checkpoint. Frequent fsyncs, or frequent fsyncs per file descriptor written? I thought it was the latter. Just a thought. 
Shridhar ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] dollar quoting
What has become of the dollar quoting mechanism that we had so much discussion about back in August/September? IIRC, a consensus was reached on the actual format of the quote delimiters (either $$ or $identifier$), and Tom had a proof of concept patch to the parser to handle it, but work was needed on psql, plpgsql, pg_dump (and pg_restore?) + docs. Is anyone working on it? cheers andrew
[HACKERS] prepared queries
The current release of DBD::Pg contains this in the CHANGES file: - $dbh->prepare() rewrites the SQL statement into an internal form, stripping out comments and whitespace, and if PostgreSQL >= 7.3 takes the stripped statement and passes that to Postgres' PREPARE statement, then rewrites the statement as 'EXECUTE DBD::PG::cached_query n ($1, $2, ... $n, $n+1)' for DBD::Pg's execute. -- Currently disabled until PREPARE works a little better Is there a TODO here, or is it just that the driver is a bit out of date? Perhaps one of these items relates to the matter?: . Allow clients to query a list of WITH HOLD cursors and prepared statements . Allow libpq to return information about prepared queries It's a pity if such an important client as DBD::Pg can't really use PQs. Is the problem that you need to supply the data types of the params at PREPARE time rather than discovering them from the result of a PREPARE? cheers andrew
Re: [HACKERS] dollar quoting
On Thu, 5 Feb 2004, Andrew Dunstan wrote: What has become of the dollar quoting mechanism that we had so much discussion about back in August/September? IIRC, a consensus was reached on the actual format of the quote delimiters (either $$ or $identifier$), and Tom had a proof of concept patch to the parser to handle it, but work was needed on psql, plpgsql, pg_dump (and pg_restore?) + docs. Is anyone working on it? I am, but not very quickly. I hope to have some time in the next month, but if someone else beats me to it I'll just be happy it got done. Jon
Re: [HACKERS] PITR Dead horse?
[EMAIL PROTECTED] I set myself as owner, since I didn't figure it was something you really needed added to your plate? :) Just means you don't have to go through and do the Approvals for postings when they need it, I'll just do it as my normal stuff ... On Thu, 5 Feb 2004, Bruce Momjian wrote: Marc G. Fournier wrote: On Wed, 4 Feb 2004, Tatsuo Ishii wrote: I and some other developers are also interested in. Do you think we can work together? Sure. Why not. I think it would be practical to decide who is the leader of this project, though. Is this something large enough, like the win32 stuff, that having a side list for discussions is worth setting up? Yes, I would like to have such a list, and will advertise it on the PITR project page: http://momjian.postgresql.org/main/writings/pgsql/project Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
Shridhar Daithankar [EMAIL PROTECTED] writes: There are other benefits of writing pages earlier even though they might not get synced immediately. Such as? It would tell kernel that this is latest copy of updated buffer. Kernel VFS should make that copy visible to every other backend as well. The buffer manager will fetch the updated copy from VFS cache next time. All without going to disk actually..(Within the 30 seconds window of course..) This seems quite irrelevant given the way we handle shared buffers. frequent fsyncs or frequent fsyncs per file descriptor written? I thought it was the latter. You can only fsync one FD at a time (too bad ... if there were a multi-file-fsync API it'd solve the overspecified-write-ordering issue). regards, tom lane
Re: [HACKERS] dollar quoting
Andrew Dunstan [EMAIL PROTECTED] writes: IIRC, a consensus was reached on the actual format of the quote delimiters (either $$ or $identifier$), and Tom had a proof of concept patch to the parser to handle it, but work was needed on psql, plpgsql, pg_dump (and pg_restore?) + docs. I think someone has to fix psql before we can consider applying the backend patch. Fixing the other stuff can come after. Is anyone working on it? I kinda thought you had volunteered to work on the psql part... regards, tom lane
Re: [HACKERS] dollar quoting
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: IIRC, a consensus was reached on the actual format of the quote delimiters (either $$ or $identifier$), and Tom had a proof of concept patch to the parser to handle it, but work was needed on psql, plpgsql, pg_dump (and pg_restore?) + docs. I think someone has to fix psql before we can consider applying the backend patch. Fixing the other stuff can come after. Makes sense. Is anyone working on it? I kinda thought you had volunteered to work on the psql part... I don't recall being that specific, but you could be right. In any case, I didn't want to trip over anyone else, which is why I asked. I will try to coordinate with Jon. cheers andrew
Re: [HACKERS] PITR Dead horse?
Wow. What a wonderful response. Thanks all! On Thu, 2004-02-05 at 08:57, Bruce Momjian wrote: Tatsuo Ishii wrote: Has this been beaten to death now? Just curious if PITR was in Dev tree yet. Been out of the loop. TIA. My co-workers and I are very interested in implementing PITR. We will tackle this for 7.5 if no one objects. I have put up a PITR project page: http://momjian.postgresql.org/main/writings/pgsql/project -- Austin Gonyou [EMAIL PROTECTED] Coremetrics, Inc.
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
People keep saying that the bgwriter mustn't write pages synchronously because it'd be bad for performance, but I think that analysis is faulty. Performance of what --- the bgwriter? Nonsense, the *point* Imho that depends on the workload. For a normal OLTP workload this is certainly correct. I do not think it is correct for mass loading, or an otherwise IO bound db. of the bgwriter is to do the slow tasks. The only argument that has any merit is that O_SYNC or immediate fsync will prevent us from having multiple writes outstanding and thus reduce the efficiency of disk write scheduling. This is a valid point but there is a limit to how many writes we need to have in flight to keep things flowing smoothly. But that is imho the main point. The difference for modern disks is 1Mb/s for random 8k vs. 20 Mb/s for random 256k. Don't get me wrong: I think sync writing would achieve maximum performance, but you have to try to write physically adjacent 256k, and you need a vague idea which blocks to write in parallel. And since that is not so easy I think we could leave it to the OS. And as an aside I think 20-30 minute checkpoint intervals would be sufficient with a bgwriter. Andreas PS: don't most syncers have 60s intervals, not 30?
Re: [HACKERS] PITR Dead horse?
-----Original Message----- From: Dave Page [mailto:[EMAIL PROTECTED] Sent: Thursday, February 05, 2004 11:02 AM To: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: RE: [HACKERS] PITR Dead horse? Of course, but I would argue that my claim that PostgreSQL is reliable is backed up by the lack of people posting messages like 'we had a powercut and now my DB is hosed'. It's not like that. It's more like 'what will happen if we had a powercut/ disk failure/cpu failure/memory failure, etc, etc.' and the answer I have to give is 'why, there is PITR of course!'. No other answer will pass in the enterprise world. Those people are not open-minded, they'd rather be safe than sorry. Do they have specific problems with the reliability of PostgreSQL then? Perhaps you could post details of how things have gone wrong for them (assuming you haven't already - I don't recall anything on -hackers recently). Nothing remarkable. PostgreSQL just works. But as I said before, in the enterprise world, good sleep at night is treasured above all. Regards, Dave
Re: [HACKERS] PITR Dead horse?
-----Original Message----- From: Nicolai Tufar [mailto:[EMAIL PROTECTED] Sent: 05 February 2004 17:35 To: Dave Page; [EMAIL PROTECTED] Subject: RE: [HACKERS] PITR Dead horse? -----Original Message----- From: Dave Page [mailto:[EMAIL PROTECTED] Sent: Thursday, February 05, 2004 11:02 AM To: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: RE: [HACKERS] PITR Dead horse? Of course, but I would argue that my claim that PostgreSQL is reliable is backed up by the lack of people posting messages like 'we had a powercut and now my DB is hosed'. It's not like that. It's more like 'what will happen if we had a powercut/ disk failure/cpu failure/memory failure, etc, etc.' and the answer I have to give is 'why, there is PITR of course!'. No other answer will pass in the enterprise world. Those people are not open-minded, they'd rather be safe than sorry. Ahh, that's not quite what I thought you meant. It sounded like you were questioning the reliability of PostgreSQL, not its ability to be recovered to point of failure. Do they have specific problems with the reliability of PostgreSQL then? Perhaps you could post details of how things have gone wrong for them (assuming you haven't already - I don't recall anything on -hackers recently). Nothing remarkable. PostgreSQL just works. But as I said before, in the enterprise world, good sleep at night is treasured above all. My SQL2K servers give me far more sleepless nights than PostgreSQL ever did! Regards, Dave.
[HACKERS] It's past time to redo the smgr API
A long time ago Vadim proposed that we should revise smgr.c's API so that it does not depend on Relations (relcache entries); rather, only a RelFileNode value should be needed to access a file in smgr and lower levels. This would allow us to get rid of the concept of blind writes. As of CVS tip, with most writes being done by the bgwriter, most writes are blind. Try tracing the bgwriter: every write involves a file open and close, which has got to be hurting its performance. The same has been true for awhile now of the background checkpointer. I'm motivated to do something about this today because I'm reviewing the PITR patch that J.R. Nield and Patrick Macdonald were working on, and one of the things I don't care for is the kluges it adds to the smgr API to work around lack of a Relation in some operations. I propose the following revisions: * smgr.c should internally keep a hashtable (indexed by RelFileNode) with struct SMgrRelation entries for each relation that has recently been accessed. File access information for the lower-level modules (md.c, fd.c) would appear in this hashtable entry. * smgropen takes a RelFileNode and returns a pointer to a hashtable entry, which is used as the argument to all other smgr operations. smgropen itself does not cause any actual I/O; thus it doesn't matter if the file doesn't exist yet. smgropen followed by smgrcreate is the way to create a new relation on disk. Without smgrcreate, file existence will be checked at the first read or write attempt. * smgrclose closes the md.c-level file and drops the hashtable entry. Hashtable entries remain valid unless explicitly closed (thus, multiple smgropens for the same file are legal). * The rd_fd field of relcache nodes will be replaced by an SMgrRelation * pointer, so that callers having a Relation reference can access the smgr file easily. 
(In particular, this means that most calls to the bufmgr will still pass Relations, and we don't need to propagate the API change up to the bufmgr level.) * The existing places in the bufmgr that do RelationNodeCacheGetRelation followed by either regular or blind write would be replaced by smgropen followed by smgrwrite. Since smgropen is just a hash table lookup, it is no more expensive than the RelationNodeCacheGetRelation it replaces. Note these places would *not* do smgrclose afterwards. * Because we don't smgrclose after a write, it is possible to have dangling smgr entries that aren't useful any more, as well as open file descriptors underneath them. This isn't too big a deal on Unix but it will be fatal for the Windows port, since it would prevent a DROP TABLE if some other backend happens to have touched the table. What I propose to do about this is: 1. In the bgwriter, at each checkpoint do smgrcloseall to close all open files. 2. In regular backends, receipt of a relcache flush message will result in smgrclose(), even if there is not a relcache entry for the given relation ID. A global cache flush event (sinval buffer overflow) causes smgrcloseall. This ensures that open descriptors for a dead relation will not persist past the next checkpoint. We had already agreed, I think, that the best solution for Windows' problem with deleting open files is to retry pending deletes at checkpoint. This smgr rewrite will not contribute to actually making that happen, but it won't make things worse either. * I'm going to get rid of the "which" argument that once selected a particular smgr implementation (md.c vs mm.c). It's vestigial at present and is misdefined anyway. If we were to resurrect the concept of multiple storage managers, we'd need the selection to be determinable from the RelFileNode value, rather than being a separate argument. 
(When we add tablespaces, we could imagine associating smgrs with tablespaces, but it would have to be the responsibility of smgr.c to determine which smgr to use based on the tablespace ID.) * We can remove the dummy cache support in relcache.c, as well as RelationNodeCacheGetRelation() and the hashtable that supports it. * The smgr hashtable entries could possibly be used for recording other status; for instance keeping track of which relations need fsync in the bgwriter. I'm also thinking of merging smgr.c's existing list-of-rels-to-be-deleted into this data structure. * AFAICS the only downside of not having a Relation available in smgr.c and md.c is that error messages could only refer to the RelFileNode numbers and not to the relation name. I'm not sure this is bad, since in my experience what you want to know about such errors is the actual disk filename, which RelFileNode tells you and relation name doesn't. We could preserve the current behavior by passing the relation name to smgropen when available, and saving the name in struct SMgrRelation. But I'm inclined not to. Comments? regards, tom lane
Re: [HACKERS] PITR Dead horse?
Dave Page [EMAIL PROTECTED] writes:
> Ahh, that's not quite what I thought you meant. It sounded like you were questioning the reliability of PostgreSQL, not its ability to be recovered to the point of failure.

I think the waters got muddied a bit by the suggestion elsewhere in the thread (not from Nicolai, IIRC) that we needed a mailing list to talk about reliability issues in general. We know we need PITR to help us become a more credible enterprise-grade database, so that discussion is short and sweet. What people were confused about was whether there were enough other issues to need ongoing discussion.

regards, tom lane
Re: [HACKERS] PITR Dead horse?
-----Original Message-----
From: Dave Page [mailto:[EMAIL PROTECTED]
> My SQL2K servers give me far more sleepless nights than PostgreSQL ever did!
> Regards, Dave.

You bet! I totally agree with you. Technicians like you, me, and most people on this list already know that PostgreSQL is stable and reliable. It is management that needs to be convinced, and for this we need to have PITR in the feature list.

Regards, Nicolai
Re: [HACKERS] PITR Dead horse?
On Thu, 2004-02-05 at 14:00, Nicolai Tufar wrote:
> > My SQL2K servers give me far more sleepless nights than PostgreSQL ever did!
> You bet! I totally agree with you. Technicians like you, me, and most people on this list already know that PostgreSQL is stable and reliable. It is management that needs to be convinced, and for this we need to have PITR in the feature list.

As previously stated by Bruce, I believe, the mindshare department needs some work. PITR is a necessity for this, but it also helps when comparing features with other DBs that people and businesses are currently familiar with.

-- Austin Gonyou [EMAIL PROTECTED] Coremetrics, Inc.
Re: [HACKERS] Why has postmaster shutdown gotten so slow?
Jan Wieck wrote:
> Tom Lane wrote:
> > Shutdown of an idle postmaster used to take about two or three seconds (mostly due to the sync/sleep(2)/sync in md_sync). For the last couple of days it's taking more like a dozen seconds. I presume somebody broke something, but I'm unsure whether to pin the blame on bgwriter or Windows changes. Anyone care to fess up?
> I guess it could well be the bgwriter, which when it has nothing to do at all sleeps for 10 seconds. Not sure, will check.

I checked the background writer for this and I cannot reproduce the behaviour. If the bgwriter has zero blocks to write, it does a PG_USLEEP for 10 seconds, which on Unix is done by select(), and that is correctly interrupted when the postmaster sends it the term signal on shutdown.

Jan

-- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] #
Re: [HACKERS] Recursive queries?
hello hackers,

Some time ago I played with PgSQL and wrote the simplest working prototype of a WITH clause for it. It doesn't do any checks and performs only the simplest selects, but it works. I can contribute it and develop it further to a production state.

regards, .evgen
[HACKERS] Preventing duplicate vacuums?
Folks,

Just occurred to me that we have no code to prevent a user from running two simultaneous lazy vacuums on the same table. I can't think of any circumstance in which running two vacuums would be desirable behavior; how difficult would it be to make this an exception? This becomes a more crucial issue now, since the introduction of vacuum_delay makes overlapping vacuums more probable.

-- -Josh Berkus Aglio Database Solutions San Francisco
Re: [HACKERS] It's past time to redo the smgr API
On Thu, 5 Feb 2004, Tom Lane wrote:
> * Because we don't smgrclose after a write, it is possible to have dangling smgr entries that aren't useful any more, as well as open file descriptors underneath them. This isn't too big a deal on Unix but it will be fatal for the Windows port, since it would prevent a DROP TABLE if some other backend happens to have touched the table. What I propose to do about this is:
> 1. In the bgwriter, at each checkpoint do smgrcloseall to close all open files.
> 2. In regular backends, receipt of a relcache flush message will result in smgrclose(), even if there is not a relcache entry for the given relation ID. A global cache flush event (sinval buffer overflow) causes smgrcloseall.
> This ensures that open descriptors for a dead relation will not persist past the next checkpoint. We had already agreed, I think, that the best solution for Windows' problem with deleting open files is to retry pending deletes at checkpoint. This smgr rewrite will not contribute to actually making that happen, but it won't make things worse either.

'k, only comment is on this one ... would it not be a bit more efficient to add a flag to the SMgrRelation * structure that acts as a timer? If it is -1, the relation has been deleted; otherwise, set it to the epoch time at which the relation was last accessed. Then your 'smgrcloseall', instead of closing all files (even active/undeleted ones), would only close those that have either been idle for x minutes (maybe a postgresql.conf tunable there?) or been DROP'd.

Assuming, that is, that your end goal is to reduce the overall # of opens/closes of files ... this way, instead of closing 20 just to reopen 19 of them, you only close the one that needs it ...

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664
Re: [HACKERS] It's past time to redo the smgr API
Marc G. Fournier [EMAIL PROTECTED] writes:
> 'k, only comment is on this one ... would it not be a bit more efficient to add a flag to the SMgrRelation * structure that acts as a timer?

Hm, we could try that, although I'm not sure it would help much. You'd have to set the timeout to be longer than a checkpoint interval to make any difference.

In the back of my mind is the thought that the Windows guys are going to end up passing file-delete requests over to the bgwriter anyway, which would largely eliminate the issue --- the bgwriter would know which files need to be smgrclose'd and wouldn't have to do smgrcloseall. (If they don't do this, how are they going to cope with backends that exit before their file deletion is completed?)

I'll do it the easy way for now and we can refine it after we see what the file-close solution for Windows ends up looking like.

regards, tom lane
Re: [HACKERS] Preventing duplicate vacuums?
On Thu, 2004-02-05 at 15:37, Josh Berkus wrote:
> Just occurred to me that we have no code to prevent a user from running two simultaneous lazy vacuums on the same table. I can't think of any circumstance in which running two vacuums would be desirable behavior; how difficult would it be to make this an exception?

You have an 8 billion row table with some very high turnover tuples (lots of updates to a few thousand rows). A partial or targeted vacuum would be best; failing that, you kick them off fairly frequently, especially if I/O isn't really an issue.
Re: [HACKERS] It's past time to redo the smgr API
On Thu, Feb 05, 2004 at 02:05:46PM -0500, Tom Lane wrote:
> * smgrclose closes the md.c-level file and drops the hashtable entry. Hashtable entries remain valid unless explicitly closed (thus, multiple smgropens for the same file are legal).

So, will there be a refcount on each cache entry? Else, how would smgrclose know when to drop the cache entry if there had been multiple smgropen calls for the same file? (Unless, of course, each smgropen call yields a different hash entry?)

> I'm also thinking of merging smgr.c's existing list-of-rels-to-be-deleted into this data structure.

Please don't. In the nested transaction environment, each subxact has to keep track of which files should be deleted. That's why I was trying to put the list inside a transaction status stack. Another way to do it would be keeping the list of files to delete along with the deleting Xid, but that would also require keeping a list of which xacts aborted and which ones didn't.

> * AFAICS the only downside of not having a Relation available in smgr.c and md.c is that error messages could only refer to the RelFileNode numbers and not to the relation name. I'm not sure this is bad, since in my experience what you want to know about such errors is the actual disk filename, which RelFileNode tells you and the relation name doesn't.

I agree it's best to have the filename in the error message, but IMHO the relation name should also be presented to the user for clarity.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl) Endurecerse, pero jamás perder la ternura (E. Guevara)
Re: [HACKERS] Preventing duplicate vacuums?
Rod,
> You have an 8 billion row table with some very high turnover tuples (lots of updates to a few thousand rows). A partial or targeted vacuum would be best; failing that, you kick them off fairly frequently, especially if I/O isn't really an issue.

Yes, but we don't have partial or targeted vacuums yet. When tablespaces are finished, presumably the lock would be per tablespace.

-- -Josh Berkus Aglio Database Solutions San Francisco
Re: [HACKERS] [PATCHES] log session end - again
Chester Kustarz wrote:
> On Thu, 5 Feb 2004, Bruce Momjian wrote:
> > Wow, I like the idea too --- pid plus time_t start time of backend. Actually, it would be good to have the start time first so you can sort everything in order of start time.
> Why not just add a printf-like pattern so the user can output whatever they like? %p - backend pid, %t - connect time, %C - standard connection string (%t:%p), %a - autoincrement number

log_line_info, where this would be implemented, is indeed done with a printf-like string, which is easily extensible. The following is adapted from the description of the original patch, which it is my intention to bring up to date soon and resubmit:

Prefixes each message in the server log file or syslog with extra information as specified in the string. The default is an empty string. The following sequences in the string are translated: '%U' is replaced with the username, '%D' is replaced by the database name, and '%%' is replaced by '%'. All other characters are copied literally to the log. For example, log_line_info = '%U%%%D ' will produce output like this:

myuser%mydb LOG: query: create table foo(blah text);

cheers andrew
Re: [HACKERS] Preventing duplicate vacuums?
Tom,
> Yes we do: there's a lock.

Sorry, bad test. Forget I said anything. Personally, I would like to have the 2nd vacuum error out instead of blocking. However, I'll bet that a lot of people won't agree with me.

-- -Josh Berkus Aglio Database Solutions San Francisco
Re: [HACKERS] It's past time to redo the smgr API
Alvaro Herrera [EMAIL PROTECTED] writes:
> On Thu, Feb 05, 2004 at 02:05:46PM -0500, Tom Lane wrote:
> > * smgrclose closes the md.c-level file and drops the hashtable entry. Hashtable entries remain valid unless explicitly closed (thus, multiple smgropens for the same file are legal).
> So, will there be a refcount on each cache entry?

No, I deliberately intended not. There is at most one persistent reference to an smgr hashtable entry; that's the one in the relcache entry for the relation. The only caller of smgrclose will be relcache.c, and it'll do it only when clearing the relcache reference (or, in the sinval case, upon determining that there is no relcache entry for the relation). Adding a reference count would just complicate matters --- for instance, we'd have to be able to reset them after an error, which would be rather hard from smgr's point of view since it really shouldn't be looking into the relcache to see if there's a reference there.

> > I'm also thinking of merging smgr.c's existing list-of-rels-to-be-deleted into this data structure.
> Please don't. In the nested transaction environment, each subxact has to keep track of which files should be deleted. That's why I was trying to put the list inside a transaction status stack.

Hm. Okay, I'll leave that separate for now, although I think we do want to merge the structures eventually.

> Another way to do it would be keeping the list of files to delete along with the deleting Xid, but that would also require keeping a list of which xacts aborted and which ones didn't.

Really? When a subtransaction finishes, you either delete its files immediately (ie, if it aborted, you can drop files it created) or you reassign them as the responsibility of the parent (since any effects of the subtransaction really depend on whether the parent commits, you can't actually do its deletes yet). I don't see the need to remember subtransaction state further than that. So ISTM all you need is one field added to the existing list entries to remember which subtransaction is currently on the hook for a given file-deletion request.

regards, tom lane
Re: [HACKERS] It's past time to redo the smgr API
On Thu, Feb 05, 2004 at 04:55:44PM -0500, Tom Lane wrote:
> Alvaro Herrera [EMAIL PROTECTED] writes:
> > Another way to do it would be keeping the list of files to delete along with the deleting Xid, but that would also require keeping a list of which xacts aborted and which ones didn't.
> Really? When a subtransaction finishes, you either delete its files immediately (ie, if it aborted, you can drop files it created) or you reassign them as the responsibility of the parent (since any effects of the subtransaction really depend on whether the parent commits, you can't actually do its deletes yet).

Right, that's the mechanism I originally envisioned. (A similar one is needed for deferred triggers, async notifies, heavy locks and on-commit actions.)

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl) Y dijo Dios: Que sea Satanás, para que la gente no me culpe de todo a mí. Y que hayan abogados, para que la gente no culpe de todo a Satanás
Re: [HACKERS] It's past time to redo the smgr API
On Thu, 5 Feb 2004, Tom Lane wrote:
> Marc G. Fournier [EMAIL PROTECTED] writes:
> > 'k, only comment is on this one ... would it not be a bit more efficient to add a flag to the SMgrRelation * structure that acts as a timer?
> Hm, we could try that, although I'm not sure it would help much. You'd have to set the timeout to be longer than a checkpoint interval to make any difference.

Why? Setting it to the checkpoint interval itself should be sufficient, no? All you want to do is avoid closing any files that were used during the last checkpoint interval, since there is a good chance you'd have to reopen them once more in the next checkpoint interval ...

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664
Re: [HACKERS] It's past time to redo the smgr API
Marc G. Fournier [EMAIL PROTECTED] writes:
> Why? Setting it to the checkpoint interval itself should be sufficient, no? All you want to do is avoid closing any files that were used during the last checkpoint interval ...

If we did that then (on Windows) every DROP TABLE would take one extra checkpoint interval to take effect in terms of freeing disk space. Not sure if this is a good tradeoff for avoiding some file opens. In any case, I think we should leave it to be debated until after we see what the Windows file-closing solution turns out to be. It might become a non-issue.

regards, tom lane
Re: [HACKERS] It's past time to redo the smgr API
Marc G. Fournier [EMAIL PROTECTED] writes:
> 'k, but that would be a different scenario, no? As I mentioned in my original, a DROP TABLE would reset its timeout to -1, meaning to close it and drop it on the next checkpoint interval ...

How would it do that? These structs are local to particular backends, so there's no way for a DROP TABLE occurring in one backend to reach over and reset the timeout in the bgwriter process. If we add communication that lets the bgwriter know the table is being dropped, then the problem is solved directly.

regards, tom lane
Re: [HACKERS] It's past time to redo the smgr API
> * AFAICS the only downside of not having a Relation available in smgr.c and md.c is that error messages could only refer to the RelFileNode numbers and not to the relation name. I'm not sure this is bad, since in my experience what you want to know about such errors is the actual disk filename, which RelFileNode tells you and the relation name doesn't. We could preserve the current behavior by passing the relation name to smgropen when available, and saving the name in struct SMgrRelation. But I'm inclined not to. Comments?

That all sounds pretty nice. From my point of view, I recall you saying a long time ago that this would need to be done for tablespaces - so I just request that the rewrite be done with future tablespaces in mind :)

Chris
[HACKERS] Vacuum Delay feature
The attached patch applies to CVS tip as of 02/05/2004 and implements the cost-based vacuum delay feature. A detailed description with charts of different configuration settings can be found here: http://developer.postgresql.org/~wieck/vacuum_cost/

There is a problem left that seems to be related to Tom's observations on the shutdown behaviour of the postmaster. My current guess is that the napping done via select(2) somehow prevents responding to the query abort signal.

Jan

[Attachment: vacuum_cost.75devel.diff.gz - GNU Zip compressed data]
Re: [HACKERS] Vacuum Delay feature
Attached is a corrected version that solves the query cancel problem by not napping any more and going at full speed as soon as any signal is pending. If nobody objects, I'm going to commit this tomorrow.

Jan

Jan Wieck wrote:
> The attached patch applies to CVS tip as of 02/05/2004 and implements the cost-based vacuum delay feature. A detailed description with charts of different configuration settings can be found here: http://developer.postgresql.org/~wieck/vacuum_cost/
> There is a problem left that seems to be related to Tom's observations on the shutdown behaviour of the postmaster. My current guess is that the napping done via select(2) somehow prevents responding to the query abort signal.

[Attachment: vacuum_cost.75devel.diff.gz - GNU Zip compressed data]