Re: [pgsql-hackers-win32] [HACKERS] Weird new time zone

2004-07-21 Thread Tom Lane
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
>> Right.  The problem we are actually faced with is to identify 
>> which of the zic timezones is the best match for the system's 
>> timezone setting.
>> One of the issues is that it's not clear what "best" means...
>> 
>> At the moment I like Oliver Jowett's idea of defining "best" 
>> as "the one that matches furthest back".

> Sounds reasonable to me. As long as a clear warning is put in the log
> whenever something is picked that is not a perfect match, so the admin
> is directed at the potential problem and can fix it (by setting the GUC
> timezone variable).

I'm not sure that a log entry is needed --- SHOW TIMEZONE will make it
perfectly clear what zone was selected.

But in any case, I've committed code that implements Oliver's idea.
Could folks take another swipe at it and see if it works well in their
local zones?  Also, it'd still be interesting to see if we could #ifdef
out the matching on zone names for Windows and still get reasonable
results.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] Completed TODO item?

2004-07-21 Thread Gavin Sherry
* Have psql show more information about sequences

template1=# \d foo_seq
Sequence "public.foo_seq"
Column |  Type
---+-
 sequence_name | name
 last_value| bigint
 increment_by  | bigint
 max_value | bigint
 min_value | bigint
 cache_value   | bigint
 log_cnt   | bigint
 is_cycled | boolean
 is_called | boolean

That item seems to be done or have I missed something?

Gavin

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Sorting out acl fixes

2004-07-21 Thread Christopher Kings-Lynne
When did that get to be part of the requirements?  I don't even know
who you expect to do this (backend? pg_dump? user?) or at what level
you think the fixing should happen (GRANT/REVOKE?  UPDATE pg_class
SET relacl = fixme(relacl)?  direct hacking of the ACL array?).  To
say nothing of the semantic problems of deciding what an invalid
ACL is really supposed to mean.
I was referring to fixing my own database that is full of these acls 
that dump incorrectly - perhaps you don't give me enough credit.  I'm 
thinking that if I can find a watertight way of fixing it at pg_dump 
time I should, for pre 7.5 databases.  Should I?

Chris
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Sorting out acl fixes

2004-07-21 Thread Tom Lane
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
>> The solution I had in mind was for ALTER OWNER to run through the ACL
>> and replace the old owner ID with the new one wherever the old one
>> appears, in both grantor and grantee positions.

> What about fixing existing bad acls?

When did that get to be part of the requirements?  I don't even know
who you expect to do this (backend? pg_dump? user?) or at what level
you think the fixing should happen (GRANT/REVOKE?  UPDATE pg_class
SET relacl = fixme(relacl)?  direct hacking of the ACL array?).  To
say nothing of the semantic problems of deciding what an invalid
ACL is really supposed to mean.

I'll be satisfied if ALTER OWNER does not transform a valid
configuration into an invalid one.  Right now it fails to meet that
minimal requirement.  Considering we are weeks past feature freeze,
I don't want to get into inventing a magic wand that can fix existing
breakage automatically.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Sorting out acl fixes

2004-07-21 Thread Christopher Kings-Lynne
The solution I had in mind was for ALTER OWNER to run through the ACL
and replace the old owner ID with the new one wherever the old one
appears, in both grantor and grantee positions.  So in your example
{chriskl=arwdRxt/chriskl,other=r/chriskl}
becomes
{gumby=arwdRxt/gumby,other=r/gumby}
You could skip doing this when the ACL is null of course, since the
default assumption about its contents will change in just the same way.
What about fixing existing bad acls?  I can't figure out a grant or 
revoke statement to do it?  Do I have to update to set the relacl to 
null and then re-run the fixed set of grants?

Chris
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Sorting out acl fixes

2004-07-21 Thread Tom Lane
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
> 1. Should we make the owner aclitem NEVER appear in the acl list?  ie. 
> when we do the first grant on an object, we don't put in a default acl 
> for the owner.  Instead we special case the aclcheck to always allow the 
> owner full privilieges?

That would be *entirely* unacceptable, since it would for example
prevent the owner from making the table read-only to himself.
I think that's an important feature to preserve.

The solution I had in mind was for ALTER OWNER to run through the ACL
and replace the old owner ID with the new one wherever the old one
appears, in both grantor and grantee positions.  So in your example
{chriskl=arwdRxt/chriskl,other=r/chriskl}
becomes
{gumby=arwdRxt/gumby,other=r/gumby}

You could skip doing this when the ACL is null of course, since the
default assumption about its contents will change in just the same way.

The minimum you could safely do is make this replacement in every
place where the old owner appears as a grantor, but leave grantees
alone.  This rule produces

{chriskl=arwdRxt/gumby,other=r/gumby}

Now IMHO this would be an utterly bizarre behavior ... but it would
at least produce a legal, consistent state of the ACL, in which every
granted right is traceable back to the new owner's implicit grant
options.  If the new owner gumby didn't want chriskl to have those
permissions, he'd at least be able to revoke 'em.  One would think
though that the first alternative is much more likely to be what
people would expect.

>{chriskl=arwdRxt/gumby,other=r/chriskl}

When gumby is the owner, this is an illegal ACL: chriskl is granting
rights he doesn't have grant option for.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] check point segments leakage ?

2004-07-21 Thread Tom Lane
>> ...
>> Would that prevent VACUUM from doing its work? It should be able to
>> check the last startup xid to check that isn't the case, but suppose a
>> backend had exited without taking down the postmaster.

There is no such thing as a backend crashing without the postmaster
noticing (at least not unless your kernel is seriously broken).

It is entirely possible for a backend not to log xact commit or abort,
though --- in fact I think that is the normal case for a read-only
transaction (no point in writing a clog entry if no one will ever
consult it, eh?).  This is not unsafe because the actual logic for
such things is:

1. Transaction still running?  (check shared memory PGPROC array to
   see if any backend claims to be running it)

2. Transaction committed or aborted according to pg_clog?

3. If none of the above, it must have crashed --- mark it aborted in
   pg_clog.

Also, VACUUM's pruning logic does not depend at all on whether individual
transactions are still running or not.  The issue there is the oldest
xid that is still shown as running in the shared-memory PGPROC array.
AFAIK this is highly reliable.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] miniscule compiler barf in pg_ctl.c

2004-07-21 Thread Tom Lane
"Dann Corbit" <[EMAIL PROTECTED]> writes:
>   default:
> ;  /* << Just a semicolon added here  */
>/* assert(false); */
>  }

Personally I prefer writing

default:
break;

Switch branches that don't have break or return at the end are trouble
waiting to happen, compiler glitches or no ...

Patched, thanks!

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Sorting out acl fixes

2004-07-21 Thread Christopher Kings-Lynne
1. Should we make the owner aclitem NEVER appear in the acl list?  ie. 
when we do the first grant on an object, we don't put in a default acl 
for the owner.  Instead we special case the aclcheck to always allow the 
owner full privilieges?  Also, if the first grant was 'select' for the 
'other' user, and then we changed the owner to the 'other' user, should 
we erase the 'other' user's aclitem?
I forgot to mention - under this schema grants/revokes to the owner 
become no-ops.

Chris
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Missing header in zic.c?

2004-07-21 Thread Tom Lane
"Dann Corbit" <[EMAIL PROTECTED]> writes:
> When compiling zic.c, the definition for optarg was missing (obviously
> getopt.h was missing).  Adding getopt.h to zic.c solved that problem.

Fixed --- thanks.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Why we really need timelines *now* in PITR

2004-07-21 Thread Christopher Kings-Lynne
That gives us enough to talk through and begin some testing.
Anybody have any other horror stories, bring 'em on.
I think that the PITR docs will have to be written in two sections.  One 
will need to be a pure reference that orthogonally describes the 
options, etc.  The other section will need to be a scenario-based 
explanation of what to do/how to recover in all the major different 
failure patterns.  It's the only way people (I!) will understand it all.

From my point of view, what I need PITR to be able to do is allow me to 
restore to any point in the 24 hour period between pg_dumpalls.  I also 
need to know what the exact criteria for deleting archived logs every 24 
hours, and how that can be determined automatically in a script 
(checking the pg_dumpall end-of-log marker exists as well).  I need to 
be told to copy, not move the logs.  Also, I need to be sure that 
pg_dumpall is enough, and I don't need to make sure I issue a checkpoint 
before the pg_dumpall or anything.

Chris
---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] Sorting out acl fixes

2004-07-21 Thread Christopher Kings-Lynne
I'm playing around trying to fix the problem where ACLs enter an illegal 
state when you change the owner of a table, say.

This is the current situation:
test=# create user gumby;
CREATE USER
test=# create user other;
CREATE USER
test=# create table test (a int4);
CREATE TABLE
test=# grant select on test to other;
GRANT
test=# alter table test owner to gumby;
ALTER TABLE
test=# \dp
   Access privileges for database "test"
 Schema | Name | Type  | Access privileges
+--+---+---
 public | test | table | {chriskl=arwdRxt/chriskl,other=r/chriskl}
(1 row)
test=# \dt
   List of relations
 Schema | Name | Type  | Owner
+--+---+---
 public | test | table | gumby
(1 row)
Now, the chriskl user's old owner privs are still there, but are granted 
by chriskl still.  The initial fix would be to modify the acl to be like 
this after owner change:

{chriskl=arwdRxt/gumby,other=r/chriskl}
Perhaps even:
{gumby=arwdRxt/chriskl,other=r/gumby}
But there's a few other options:
1. Should we make the owner aclitem NEVER appear in the acl list?  ie. 
when we do the first grant on an object, we don't put in a default acl 
for the owner.  Instead we special case the aclcheck to always allow the 
owner full privilieges?  Also, if the first grant was 'select' for the 
'other' user, and then we changed the owner to the 'other' user, should 
we erase the 'other' user's aclitem?

2. Should we just whenever the owner is changed, change all grantors 
that are the old owner to the new owner?

3. Should we do (2) but only when the grantor is the old owner and the 
grantee is the old owner?

Is there a logical way of determining which of these is correct?
Chris
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] PreallocXlogFiles

2004-07-21 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> Yes, I agree, but the checkpointer isn't waking up often enough
> currently to do this effectively. It's just randomly doing it.

Agreed.  Maybe it should be part of the bgwriter's idle loop, and
not directly associated with checkpoints at all.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Why we really need timelines *now* in PITR

2004-07-21 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> PITR should cope with these scenarios. These are described reasonably
> closely but not as exact mechanical tests, so as to ensure that if
> multiple solutions exist to these recovery scenarios that all paths are
> tested.

> [ snip ... ]

Now *my* head is hurting ;-)

AFAICS, we should be able to handle all of these cases, as long as
(a) the DBA doesn't manually delete any xlog files (else he probably
loses the chance to recover to the associated timeline), and
(b) the DBA doesn't screw up on picking which timeline to recover to.

I'd guess that (b) is likely to be the bigger threat :-(.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] PreallocXlogFiles

2004-07-21 Thread Simon Riggs
On Thu, 2004-07-22 at 00:35, Gavin Sherry wrote:
> On Wed, 21 Jul 2004, Simon Riggs wrote:
> 
> > I notice this:
> >
> > When a checkpoint occurs, if a log file is more than 75% full then a new
> > file will be allocated (in PreallocXlogFiles).
> >
> > This assumes we checkpoint at least 4 times per log file, otherwise it
> > will be effectively random whether we actually ever do this or not. With
> > an uneven or bursty workload, we would need to checkpoint many more
> > times per xlog to even notice this is ever being called. (I never have).
> >
> > ...but we don't check that anywhere in the code.
> 
> I prefer the idea of just checking it more often than pulling the code out
> all together. I think this sits well with Jan's work on consistent
> availability (buffer manager, vacuum delay).
> 

Good idea. Hey - we could get archiver to do this, seeing as it knows
when the logs are full. Just do: I've seen a full one, I'll prealloc
another. No test, just alloc. (Or the bgwriter...)

On Thu, 2004-07-22 at 00:53, Tom Lane wrote: 
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > I would like to remove PreallocXlogFiles on the basis that it is dead,
> > or at least pointless code.
> 
> I wonder whether we should not put back the preallocated-files GUC
> parameter that Bruce took out a release or two back.  PreallocXlogFiles
> made a lot more sense back when that parameter existed.

That's simplest, especially if the code is there.

But again, if you set it to a constant value it's not really responding
to system demands, its just the admin's guess of what to set it to.

Gavin's idea sounds more optimal...

However, I'm not at all convinced that this analysis holds up with
> bursty traffic or when the archiver is delaying rotation of old xlogs.
> If the number of physical WAL files needs to grow and shrink because
> of such effects, then PreallocXlogFiles is the only thing that can
> prevent foreground processes from having to do the work that should
> be handled by the checkpointer.

Yes, I agree, but the checkpointer isn't waking up often enough
currently to do this effectively. It's just randomly doing it.

Best regards, Simon Riggs


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] check point segments leakage ?

2004-07-21 Thread Rod Taylor
> What happens when a transaction fails to either commit or abort as a
> result of a serious error?
> 
> That looks like a transaction-in-progress doesn't it? 
> 
> Would that prevent VACUUM from doing its work? It should be able to
> check the last startup xid to check that isn't the case, but suppose a
> backend had exited without taking down the postmaster.

I don't know if this is the case now or not (I imagine it's pretty good
at cleaning up at the moment), but if we implemented 2 Phase Commit this
logic would need to be removed as transactions need to cross database
restarts.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Why we really need timelines *now* in PITR

2004-07-21 Thread Simon Riggs
On Wed, 2004-07-21 at 23:42, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > More verbosely (not numbered because they're not a sequence or
> > progression)
> 
> > - if no recovery.conf is present we do crash recovery to end of logs on
> > pg_control timeline (i.e. current) 
> 
> Check.
> 
> > - if recovery.conf is present and we do not specify a target we do
> > archive recovery to end of logs on pg_control timeline (i.e. current) 
> 
> I have done it this way for now, but I'm unconvinced whether this is the
> best default --- it might be that we'd be better off making 'latest' be
> the default.  The point here is that when you restore a tar backup,
> 'current' is going to become the timeline that was current when the
> backup was made, not the one that was current just before you wiped
> $PGDATA.  I'm not really sure which case is going to be more commonly
> wanted.

Right now, that sounds the best option. But my head hurts :)

> 
> > - if recovery.conf is present and we specify a target, but no timeline,
> > then we do archive recovery on the pg_control timeline only, and assume
> > that the target was supposed to be on this, even if we don't find it
> 
> Whether you specify a target stopping point does not matter AFAICS.  The
> timeline selection has to be made before we can even look at the data.
> 

Yes, I was describing a case where a default behaviour would be required
to make the timeline selection before the "desired" behaviour could be
enacted. 

> > - if recovery.conf is present and we specify a timeline of literally
> > 'latest' (without having to know what that is) - then we search archive
> > for the latest history file, then we do archive recovery from the
> > pg_control timeline to the latest timeline as specified in the latest
> > history file. If we specify a target, then this is searched for on
> > whatever timeline we're on as we rollforward.
> 
> Check.
> 
> > - if recovery.conf is present and we specify a timeline - then we search
> > archive for that history file, then we do archive recovery from the
> > pg_control timeline to the specified timeline as shown in that history
> > file. If we specify a target, then this is searched for on whatever
> > timeline we're on as we rollforward.
> 
> Check.
> 
> >>> I don't like the name target_in_timeline,
> >> 
> >> Agreed, but I don't have a better name offhand for it.
> 
> For lack of any better idea, I have swallowed my objections to "target"
> and called it "recovery_target_timeline".  We can easily rename the
> parameter if anyone comes up with something more compelling.
> 
> Above behavior is all committed to CVS as of a few minutes ago.
> 

...very cool.

OK, back to first principles as a cross-check then:

PITR should cope with these scenarios. These are described reasonably
closely but not as exact mechanical tests, so as to ensure that if
multiple solutions exist to these recovery scenarios that all paths are
tested.

These are written with a view to *rough* functionality of timelines,
rather than reading the above and making up cases to fit. I suggest we
see if these all work, see why not (if not) and make up some other cases
to make sure all possibilities are catered for.

1. We crash, and wish to recover, as per 7.4

2. We are running happily, using an automated standby database. The
first database fails irrecoverably and we are forced to switch to the
second system which recovers quickly to end of logs, though without the
partially full current xlog from the downed system.

3. We are running happily, but spot a rogue transaction that we wish to
expunge. We decide to run a PITR up to that txnid. We do an archive
recovery to a recovery_target_xid. We have available to us local copies
of the xlogs if required.

4. We perform (3), then after operating for an hour, we realise that
this was an extremely bad idea and decide to recover back to the point
BEFORE we started to recover the first time - i.e. to try to pretend we
had never attempted PITR in the first place because there was some even
more important data just recently committed we didn't know about.

5. We attempt (4) but fail because the then-current log, which has not
been archived, was deleted because we wouldn't need it anymore. We
decide that we made the right choice in the first place and decide to
re-run the PITR, though to a point slightly ahead of where we stopped
last time we tried that.

6. We are running a distributed system that does not properly support
two-phase commit in all of its persistent components. One of the other
components fails (of course not pg!) and we are forced to do a PITR to a
point in time that matches the best last known timestamp of all
persistent system components. We PITR to a recovery_target_time.

7. We have just done (6), but 10 minutes into production we realise that
the clocks between 2 of our systems were out by 3 seconds. Not much, but
it is causing serious errors to bang around the system. We decide to
re-run the prev

Re: [HACKERS] PreallocXlogFiles

2004-07-21 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> I would like to remove PreallocXlogFiles on the basis that it is dead,
> or at least pointless code.

It could stand improvement I'm sure, but it's not pointless,
particularly not when you have archive mode turned on and so dead xlog
segments can't necessarily be recycled immediately.  There's no
guarantee that there are very many segments available to be recycled
when a checkpoint happens, and so if you don't do some preallocation
you may find foreground processes forced to do the work instead when
they run out of forward xlog space.

If you assume a reasonably steady flow of xlog traffic and no
significant archiving delays, then you can see that the system settles
into a steady state where at each checkpoint about the same number of
old WAL files get rotated around to become forward xlog space, and
indeed there's little need for PreallocXlogFiles because MoveOfflineLogs
does all the heavy lifting.

However, I'm not at all convinced that this analysis holds up with
bursty traffic or when the archiver is delaying rotation of old xlogs.
If the number of physical WAL files needs to grow and shrink because
of such effects, then PreallocXlogFiles is the only thing that can
prevent foreground processes from having to do the work that should
be handled by the checkpointer.

I wonder whether we should not put back the preallocated-files GUC
parameter that Bruce took out a release or two back.  PreallocXlogFiles
made a lot more sense back when that parameter existed.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] PreallocXlogFiles

2004-07-21 Thread Gavin Sherry
On Wed, 21 Jul 2004, Simon Riggs wrote:

> I notice this:
>
> When a checkpoint occurs, if a log file is more than 75% full then a new
> file will be allocated (in PreallocXlogFiles).
>
> This assumes we checkpoint at least 4 times per log file, otherwise it
> will be effectively random whether we actually ever do this or not. With
> an uneven or bursty workload, we would need to checkpoint many more
> times per xlog to even notice this is ever being called. (I never have).
>
> ...but we don't check that anywhere in the code.

I prefer the idea of just checking it more often than pulling the code out
all together. I think this sits well with Jan's work on consistent
availability (buffer manager, vacuum delay).

The question is, where to call it from. Its possible that the buffer
manager may have enough information to guess how often a new checkpoint
file should be preallocated. The alternative would be to have (yet
another) backend look after this.

Or, maybe the autovacuum backend could look after this. It would have
access to stats which may be useful but it would mean that people would
have to run autovacuum if they wanted checkpoints preallocated.

Thanks,

Gavin

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Why we really need timelines *now* in PITR

2004-07-21 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> More verbosely (not numbered because they're not a sequence or
> progression)

> - if no recovery.conf is present we do crash recovery to end of logs on
> pg_control timeline (i.e. current) 

Check.

> - if recovery.conf is present and we do not specify a target we do
> archive recovery to end of logs on pg_control timeline (i.e. current) 

I have done it this way for now, but I'm unconvinced whether this is the
best default --- it might be that we'd be better off making 'latest' be
the default.  The point here is that when you restore a tar backup,
'current' is going to become the timeline that was current when the
backup was made, not the one that was current just before you wiped
$PGDATA.  I'm not really sure which case is going to be more commonly
wanted.

> - if recovery.conf is present and we specify a target, but no timeline,
> then we do archive recovery on the pg_control timeline only, and assume
> that the target was supposed to be on this, even if we don't find it

Whether you specify a target stopping point does not matter AFAICS.  The
timeline selection has to be made before we can even look at the data.

> - if recovery.conf is present and we specify a timeline of literally
> 'latest' (without having to know what that is) - then we search archive
> for the latest history file, then we do archive recovery from the
> pg_control timeline to the latest timeline as specified in the latest
> history file. If we specify a target, then this is searched for on
> whatever timeline we're on as we rollforward.

Check.

> - if recovery.conf is present and we specify a timeline - then we search
> archive for that history file, then we do archive recovery from the
> pg_control timeline to the specified timeline as shown in that history
> file. If we specify a target, then this is searched for on whatever
> timeline we're on as we rollforward.

Check.

>>> I don't like the name target_in_timeline,
>> 
>> Agreed, but I don't have a better name offhand for it.

For lack of any better idea, I have swallowed my objections to "target"
and called it "recovery_target_timeline".  We can easily rename the
parameter if anyone comes up with something more compelling.

Above behavior is all committed to CVS as of a few minutes ago.

> Another thing I note is that archive_status .ready messages are written
> for all restored xlog files (rather than .done messages).

I think this is gone now.  However, we still have the issue of preventing
re-archival of old, incomplete XLOG segments that might be brought back
into pg_xlog/ as a result of restoring a tar backup.  I don't have a
better solution to that than the one Bruce and I proposed yesterday
(make the DBA clean out pg_xlog before starting a recovery run).

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [JDBC] V3 protocol + DECLARE problems

2004-07-21 Thread Kris Jurka


On Wed, 21 Jul 2004, Oliver Jowett wrote:

> It's going to be fun using anything more than very basic cursors via the 
> V3 protocol in the JDBC driver. DECLARE does not work with parameters 
> passed via a Parse/Bind combination -- which is how we currently always 
> pass parameters when talking V3. I'm reluctant to have two 
> implementations of the parameter logic for V3 (one that does direct 
> substitution, one that uses Bind) since that's extra unnecessary code 
> and a recipe for inconsistent behaviour.

I see where you are going in the WITH HOLD case, I don't see how this 
extends to scrollable cursors without other major changes as the Execute 
protocol message assumes forward only operation.

Kris Jurka

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] parameter hints to the optimizer

2004-07-21 Thread Oliver Jowett
Merlin Moncure wrote:
Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.
When talking the V3 protocol, 7.5 defers plan generation for the unnamed 
statement until parameters are received in the Bind message (which is 
essentially the same as what you describe). There was some discussion at 
the time about making it more flexible so you could apply it to arbitary 
statements, but that needed a protocol change so it didn't happen.

So the guts of the work are done -- we'd just need a way to trigger the 
behaviour on demand. It sounds less painful to add something to PREPARE 
than to change the V3 protocol at this stage.

-O
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] check point segments leakage ?

2004-07-21 Thread Simon Riggs
On Wed, 2004-07-21 at 18:54, Rod Taylor wrote:
> > I don't know why the 1st VACUUM FULL wasn't able to reclaim the same 
> > amount of space as the 2nd one, but I would guess that it wasn't able to 
> > get a lock on some table.  It could have been autovac if it was doing a 
> > vacuum at that moment, but it could have been something else too.
> 
> Or there was a long running transaction in the background. The oldest
> active transaction will place limits on what VACUUM can or cannot
> remove.
> 

What happens when a transaction fails to either commit or abort as a
result of a serious error?

That looks like a transaction-in-progress doesn't it? 

Would that prevent VACUUM from doing its work? It should be able to
check the last startup xid to check that isn't the case, but suppose a
backend had exited without taking down the postmaster.

(...waits for thunder...)

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [JDBC] V3 protocol + DECLARE problems

2004-07-21 Thread Oliver Jowett
Tom Lane wrote:
Any chance of getting this fixed for 7.5?

I'm up to my keister in PITR and nested-xact issues and won't have time
to look at it soon.  Do you want to take a look and see if you can find
where the ball is getting dropped?
Ok, I'll do that.
-O
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] parameter hints to the optimizer

2004-07-21 Thread Simon Riggs
On Wed, 2004-07-21 at 19:12, Merlin Moncure wrote:
> There is one problem with the optimizer that is a constant source of
> frustration.  When using prepared statements and functions (particularly
> where function parameters are passed to a query), the optimizer often
> fails to utilize an index inside a plan.  
> 
> This is a well known problem because the parameter values are not known
> at the time the plan is generated, making things difficult for the
> optimizer.
> 
> It would be nice if 'hint' or sample parameters could be used when
> creating the statement so the optimizer could use those values when
> generating the plan.  For example, the default parameter syntax of C++
> could be borrowed (assuming this doesn't break any syntax rules).
> 
> example:
> prepare my_statement prepare (character varying='abc')
> as select * from t where t.k = $1;
> 
> create function my_function(int4=1234) returns [...]
> 
> Another way to deal with the problem is to defer plan generation until
> the first plan execution and use the parameters from that execution.
> 
> Am I crazy?  Comments?

Crazy enough to suggest what other RDBMS do.

It's a common problem, since it defeats the use of histogram statistics
to determine specific cardinality rather than generic cardinality.

The answer is to follow what those others do, since programs will be
written to take advantage of those optimization quirks.

DB2 supports various modes for BIND: REOPT(ALWAYS), REOPT(ONCE),
REOPT(VARS) and REOPT(NONE) - which are then manifested in their
precompiler.

..back to you,

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] PreallocXlogFiles

2004-07-21 Thread Simon Riggs
I notice this:

When a checkpoint occurs, if a log file is more than 75% full then a new
file will be allocated (in PreallocXlogFiles). 

This assumes we checkpoint at least 4 times per log file, otherwise it
will be effectively random whether we actually ever do this or not. With
an uneven or bursty workload, we would need to checkpoint many more
times per xlog to even notice this is ever being called. (I never have).

...but we don't check that anywhere in the code. 

Since checkpoints now default to every 300 seconds, we are assuming that
a log file takes at least 20 minutes to fill with an even workload,
which is not the case on busy systems. On slow systems, who cares
whether we preallocate or not? Especially now that we have the bgwriter
to smooth the workload of backends.

The idea was to preallocate a file ahead of it being required...mostly
we just hit the endspot without having preallocated any log files, so
the preallocation thing is just a waste of time.

PreallocXlogFiles is only ever called during a normal Checkpoint or
after Recovery. In both cases, there will always be xlogs recycled and
so preallocation has already taken place (except in the trivial case of
the first few xlogs after an initdb).

I would like to remove PreallocXlogFiles on the basis that it is dead,
or at least pointless code.

Objections?

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] parameter hints to the optimizer

2004-07-21 Thread Hans-Jürgen Schönig
Merlin Moncure wrote:
Merlin,
This will most likely never be accepted by the core team because it is
better to spend more time on fixing the planner than to invent some
non-standard.
As far as I know some databases support a syntax where hints can be
hidden in comments or something like that.
Meanwhile I think that hints are more of a burdon than a help.
Regards,
		Hans

I thought as much.  Still, the parameter problem is a huge headache.  Maybe if it was 
possible to use the statistics to gather a 'pseudovariable' to feed to the parameter 
based on some algorithm, the planner could give better results without exposing the 
planner inner workings to the user.
Thanks for the feedback.
Merlin

Using statistics is exactly what the planner does ...
So why should data coming from the planner being given back to the 
planner? Doesn't make sense.

For more information I highly recommend Tom's talk at Oreilly's some 
time ago. I think it is called "recent improvements in 7.4" ...

regards,
Hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/720/10 1234567 or +43/664/816 40 77
www.cybertec.at, www.postgresql.at, kernel.cybertec.at
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[HACKERS] miniscule compiler barf in pg_ctl.c

2004-07-21 Thread Dann Corbit
Title: Message



I have seen this 
before with other compilers.  The latest MINGW GCC compiler does not like a 
goto label with no statements following.
pg_ctl.c: In 
function `pgwin32_ServiceMain':pg_ctl.c:991: error: label at end of compound 
statementmake[3]: *** [pg_ctl.o] Error 1make[3]: Leaving directory 
`/u/postgresql-snapshot/src/bin/pg_ctl'make[2]: *** [all] Error 
2make[2]: Leaving directory `/u/postgresql-snapshot/src/bin'make[1]: *** 
[all] Error 2make[1]: Leaving directory 
`/u/postgresql-snapshot/src'make: *** [all] Error 2
Fix is simple, add 
an empty statement:
 
 switch 
(ret) {  case WAIT_OBJECT_0: /* shutdown event 
*/   kill(postmasterPID,SIGINT);   WaitForSingleObject(postmasterProcess,INFINITE);   break;
 
  case 
(WAIT_OBJECT_0+1): /* postmaster went down 
*/   break;
 
  default:    ;  
/* << Just a semicolon added here 
 */   /* 
assert(false); */ }


[HACKERS] Missing header in zic.c?

2004-07-21 Thread Dann Corbit
Title: Message



Perhaps it is a 
Win32 only issue.
 
This is from today's 
snapshot.
 
When compiling 
zic.c, the definition for optarg was missing (obviously getopt.h was 
missing).  Adding getopt.h to zic.c solved that 
problem.
 
Perhaps it is a 
combination of compiler flags that skirts the inclusion.  I performed a 
make clean and then a make using the MINGW tool set.
 


[HACKERS] parameter hints to the optimizer

2004-07-21 Thread Merlin Moncure
There is one problem with the optimizer that is a constant source of
frustration.  When using prepared statements and functions (particularly
where function parameters are passed to a query), the optimizer often
fails to utilize an index inside a plan.  

This is a well known problem because the parameter values are not known
at the time the plan is generated, making things difficult for the
optimizer.

It would be nice if 'hint' or sample parameters could be used when
creating the statement so the optimizer could use those values when
generating the plan.  For example, the default parameter syntax of C++
could be borrowed (assuming this doesn't break any syntax rules).

example:
prepare my_statement prepare (character varying='abc')
as select * from t where t.k = $1;

create function my_function(int4=1234) returns [...]

Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.

Am I crazy?  Comments?
Merlin


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] check point segments leakage ?

2004-07-21 Thread Rod Taylor
> I don't know why the 1st VACUUM FULL wasn't able to reclaim the same 
> amount of space as the 2nd one, but I would guess that it wasn't able to 
> get a lock on some table.  It could have been autovac if it was doing a 
> vacuum at that moment, but it could have been something else too.

Or there was a long running transaction in the background. The oldest
active transaction will place limits on what VACUUM can or cannot
remove.



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] check point segments leakage ?

2004-07-21 Thread Gaetano Mendola
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Matthew T. O'Connor wrote:
| Gaetano Mendola wrote:
|
|> I'm pretty sure, see the attached graph. Each morning at 7 a script stop
|> the autovacuum, vacuum full the database and reindex the eavy updated
|> tables
|> and restart of course the autovacuum. Note also that for all the day I
|> didn't
|> have the usual disk usage increment.
|
|
| I don't know why the 1st VACUUM FULL wasn't able to reclaim the same
| amount of space as the 2nd one, but I would guess that it wasn't able to
| get a lock on some table.  It could have been autovac if it was doing a
| vacuum at that moment, but it could have been something else too.
|
|  From the attached graph, it looks like your stead state database size
| is approx 3.0G.  After the 2nd VACUUM FULL, you dropped to 2.5G, but as
| you can see it's creeping up back up again.
|
| If you let it continue to run without running VACUUM FULL, but with
| autovacuum enabled, and it climbs to 3.0G and stops growing, then I
| think you are fine and you don't need to run VACUUM FULL at all.  If it
| continues to grop witout bound, then you need to up your FSM and/or make
| autovac more aggressive.
|
| Bottom line, you shouldn't need VACUUM FULL, if you do, I think there
| are people on this list that would like to hear about it.
I will try to disable ( I hope the management is not reading this list )
the vacuum full performed each morning, I'll leave only the reindex for
a couple of table and I'll see what happen I will post another graph
Regards
Gaetano Mendola

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFA/p247UpzwH2SGd4RAokEAJ9+xhF9g8ZbzE3ne6qCFOuV6z3LmACg9yQR
hL7LaOX8EucswifK5okQZ9g=
=jKG9
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] check point segments leakage ?

2004-07-21 Thread Matthew T. O'Connor
Gaetano Mendola wrote:
I'm pretty sure, see the attached graph. Each morning at 7 a script stop
the autovacuum, vacuum full the database and reindex the eavy updated 
tables
and restart of course the autovacuum. Note also that for all the day I 
didn't
have the usual disk usage increment.
I don't know why the 1st VACUUM FULL wasn't able to reclaim the same 
amount of space as the 2nd one, but I would guess that it wasn't able to 
get a lock on some table.  It could have been autovac if it was doing a 
vacuum at that moment, but it could have been something else too.

From the attached graph, it looks like your stead state database size 
is approx 3.0G.  After the 2nd VACUUM FULL, you dropped to 2.5G, but as 
you can see it's creeping up back up again.

If you let it continue to run without running VACUUM FULL, but with 
autovacuum enabled, and it climbs to 3.0G and stops growing, then I 
think you are fine and you don't need to run VACUUM FULL at all.  If it 
continues to grop witout bound, then you need to up your FSM and/or make 
autovac more aggressive.

Bottom line, you shouldn't need VACUUM FULL, if you do, I think there 
are people on this list that would like to hear about it.

Matthew
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


[HACKERS] text and varchar

2004-07-21 Thread Dennis Bjorklund
Why do we have both the type text and type varchar (without limit)?  
Couldn't we make one to be an alias for the other?

Since it's 2 distinct types there are some strange effects:

  dennis=> SELECT CAST ('123'::varchar AS integer);
  ERROR:  cannot cast type character varying to integer

  dennis=> SELECT CAST ('123'::text AS integer);
   int4
  --
123

Sure, the cast is simple to fix. But why do we have two types in the first 
place?

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] [JDBC] V3 protocol + DECLARE problems

2004-07-21 Thread Tom Lane
Oliver Jowett <[EMAIL PROTECTED]> writes:
> Logs follow; basically this is issuing a Parse/Bind/Execute for a 
> parameterized DECLARE, which blows up with "no value found for parameter 
> 1" despite there definitely being one there. (also, that error appears 
> on Execute, not Parse/Bind).

This may be fixable --- I'm thinking that the problem has to do with not
passing down parameters from one portal to another in a situation where
they probably should be.

> Any chance of getting this fixed for 7.5?

I'm up to my keister in PITR and nested-xact issues and won't have time
to look at it soon.  Do you want to take a look and see if you can find
where the ball is getting dropped?

> Alternatively, if we can get WITH HOLD / SCROLL behaviour in portals
> created by Execute (probably means a protocol change) that works too.

This won't be salable unless we decide we really need a protocol change
to support nested xacts properly.  With the change to SAVEPOINT syntax
the motivation for that may have diminished --- in particular I doubt
that we can usefully report a nesting depth, so the urge to fool with
the Z message has diminished greatly (in my mind anyway).

I'd suggest looking into the parameter issue if you want to have
confidence in a fix being available in 7.5.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] check point segments leakage ?

2004-07-21 Thread Gaetano Mendola
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Matthew T. O'Connor wrote:
| Gaetano Mendola wrote:
|
|> Well, today I stop the pg_autovacuum and I did a vacuum full and I
|> reindexed
|> all big tables and other 500 MB were reclamed. Could be the pg_autovacuum
|> running yesterday the responsible for these 500MB not reclamed during
|> a vacuum full and reindex already performed yesterday ?
|
|
| Probably not. Most of the time pg_autovacuum is just sleeping.  If you
| happened to fun a VACUUM FULL while pg_autovacuum was running a vacuum,
| there might have been a conflict on the tabke pg_autovacuum was working
| with at the time.
|
| Also, are you sure that the space wasn't reclaimed yesterday after the
| VACUUM FULL?  It could be that your tables have grown 500M since then.
| Remember, the minimum table size (the size after a VACUUM FULL) is not
| necessarilly the optimial size.  Postgresql will almost always need to
| reallocate the space that was reclaimed by VACUUM FULL.
I'm pretty sure, see the attached graph. Each morning at 7 a script stop
the autovacuum, vacuum full the database and reindex the eavy updated tables
and restart of course the autovacuum. Note also that for all the day I didn't
have the usual disk usage increment.
|> I'm wandering if will be possible in the 7.5 start and stop the the
|> autovacuum integrated in the backend.
|
|
| Yes (at least the patch waiting to be applied to CVS HEAD does) in order
| to stop autovacuum you will have to edit the autovac option in
| postgresql.conf and HUP the postmaster.
This is a good news.
Regards
Gaetano Mendola


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFA/o2Z7UpzwH2SGd4RAi38AKCO7XqClR/+X5b8szVJwbREC50HrQCg5M8n
R5ODgRU05IGnnS1YaK4afIk=
=ftFY
-END PGP SIGNATURE-
<>
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] check point segments leakage ?

2004-07-21 Thread Matthew T. O'Connor
Gaetano Mendola wrote:
Well, today I stop the pg_autovacuum and I did a vacuum full and I 
reindexed
all big tables and other 500 MB were reclamed. Could be the pg_autovacuum
running yesterday the responsible for these 500MB not reclamed during
a vacuum full and reindex already performed yesterday ?
Probably not. Most of the time pg_autovacuum is just sleeping.  If you 
happened to fun a VACUUM FULL while pg_autovacuum was running a vacuum, 
there might have been a conflict on the tabke pg_autovacuum was working 
with at the time.

Also, are you sure that the space wasn't reclaimed yesterday after the 
VACUUM FULL?  It could be that your tables have grown 500M since then. 
Remember, the minimum table size (the size after a VACUUM FULL) is not 
necessarilly the optimial size.  Postgresql will almost always need to 
reallocate the space that was reclaimed by VACUUM FULL.

I'm wandering if will be possible in the 7.5 start and stop the the
autovacuum integrated in the backend.
Yes (at least the patch waiting to be applied to CVS HEAD does) in order 
to stop autovacuum you will have to edit the autovac option in 
postgresql.conf and HUP the postmaster.

Matthew
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] unused variable

2004-07-21 Thread Bruce Momjian

Fixed in CVS.

---

Gaetano Mendola wrote:
> Hi all,
> 
> I'm compiling the last postgres CVS version and I get:
> 
> vacuum.c: In function `repair_frag':
> vacuum.c:1528: warning: unused variable `myXID'
> 
> 
> 
> Regards
> Gaetano Mendola
> 
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] check point segments leakage ?

2004-07-21 Thread Gaetano Mendola
Bruce Momjian wrote:
> Gaetano Mendola wrote:
>
>>Bruce Momjian wrote:
>>
>>
>>>Scott Marlowe wrote:
>>>
>>>
>>I use a checkpoint_segments = 16 but in my pg_xlog I have
>>35 files. Why 35 files ?
>>>
>>>
>>>You have 35 because the max files in pg_xlog is 2*checkpoint_segments +1
>>>or something like that. This is documented in the SGML.
>>
>>Ok, that explain why. And they will remain there also if not needed ?
>
>
> Yes, it keeps them around so it doesn't need to recreate them.
>
>
>>Another weird behaviour is that during the day the storage space usage
>>increase gruadualy. Since today as the graph show the space usage
>>is constant, it's like if some space was pre-allocated and now is
>>used, see same yesterday period between 18:00 and 24:00.
>>Toughts ?
>
>
> My guess is that you need a certain amount of free space in the tables
> to operate properly.
Well, today I stop the pg_autovacuum and I did a vacuum full and I reindexed
all big tables and other 500 MB were reclamed. Could be the pg_autovacuum
running yesterday the responsible for these 500MB not reclamed during
a vacuum full and reindex already performed yesterday ?
I'm wandering if will be possible in the 7.5 start and stop the the
autovacuum integrated in the backend.
I don't know if there is space for improvements but add columns to a table
with milion rows is very painfull, for sure could be usefull to do the
following tree operation in one shot:
1) Add column
2) Update the column
3) Set not null

Regards
Gaetano Mendola

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]