Re: [HACKERS] Release notes introductory text
On Thu, 2007-10-11 at 17:46 -0400, Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Kevin Grittner wrote:
> >> If the goal is to provide fair warning of a higher-than-usual-risk
> >> release, you've got it covered.
>
> > No, that was not the intent. The intent was to say we got a lot done in
> > one year. You have a suggestion?
>
> Yeah: take the entire paragraph out again. I concur with Neil that
> it's nothing but marketing fluff, and inaccurate fluff at that.

I think it is important that we are upbeat about what we have achieved, but perhaps we should avoid opinions in the release notes. Perhaps it should be part of the press release?

We should say that the release notes get discussed on -hackers and the press releases get discussed on -advocacy. That way the scope is clear and we can keep the marketing at arm's length from the engineering. I think we need both, but in appropriate places.

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [HACKERS] Including Snapshot Info with Indexes
Hi All,

Last two mails were sent by mistake without completion. I couldn't curse my system any further; I apologize for that. Coming back to the topic of discussion:

So I think we are clear now that it is possible to have an index with snapshot info. Let me try to enumerate the uses of having the index with snapshot info, in comparison to the Dead Space Map.

a) The Dead Space Map, if it is successful in its implementation of what it claims, will have the means to point out that all the tuples of certain chunks are frozen, for registered relations and registered chunks. There would be a lot of blocks which won't fall under this category.
   i) For example, if the records are newly inserted, that block can't be marked as containing all frozen tuples.
   ii) Imagine the case where there is a batch job / huge select query running in an enterprise for more than 6 hrs. All the blocks which have got inserted into the tables during this period might not be able to get the advantage of the DSM.
   iii) Imagine the case for which I am actually proposing the index with snapshot info: partitioned tables. Every time a new table gets created, it has to get registered into the dead space. This requires more maintenance on the DBA side.
   iv) I understand the DeadSpaceLock to be a global lock (if one transaction is updating the dead space for any unregistered chunk, no one else can query the dead space). If my statement is right, then partitioned tables might not be able to benefit from the DSM. We have to remember that for tables with daily partitions, this would prove to be a nightmare.

Other than that, there are a lot of advantages I foresee in including the indexes with snapshots:
   i) Vacuuming of these indexes need not be done with super-exclusive locks. It is possible to design a strategy to vacuum these indexes with exclusive locks on pages.
   ii) The above would mean that the index can be in operation while the vacuum is happening.
   iii) As we have already stated, it provides an efficient clustering of related data.
   iv) The reverse-mapping index, if present, provides an efficient solution to the retail-vacuum problem. So HOT can be improved further, with no need to restrict the updated tuple to the same page.
   v) Updates on tables with primary keys (where the primary key is not updated) will be able to resolve the unique constraint faster. This is a minor advantage.

The complexity of the reverse-mapping index will only be there for user-defined functional indexes. These functions can be pruned further to find out the obvious immutable ones.

I expect your valuable feedback on this.

Thanks,
Gokul.

On 10/12/07, Gokulakannan Somasundaram <[EMAIL PROTECTED]> wrote:
>
> Hi All,
>     Last mail was sent by mistake without completion. I apologize for
> that. I am continuing on that.
>
> So I think we are clear now that it is possible to have an index with
> snapshot info. Let me try to enumerate the uses of having the index with
> snapshot info, in comparison to the Dead Space Map.
>
> a) The Dead Space Map, if it is successful in its implementation of what it
> claims, will have the means to point out that all the tuples of certain
> chunks are frozen, for registered relations and registered chunks. There
> would be a lot of blocks which won't fall under this category.
>     i) For example, if the records are newly inserted, that
> block can't be marked as containing all frozen tuples.
>     ii) Imagine the case where there is a batch job / huge select
> query running in an enterprise for more than 6 hrs. All the blocks which
> have got inserted into the tables during this period might not be able to
> get the advantage of the DSM.
>
> On 10/11/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
> >
> > Gokulakannan Somasundaram wrote:
> > > As explained, if we are going to include the snapshot with indexes, Vacuum
> > > will be done on the index independent of the table, so Vacuum will not
> > > depend on immutability. We need to go to the index from the table, when we
> > > want to update the snapshot info. The problem on hand is that some of the
> > > user-defined functions are mutable, whereas the user might mark them
> > > immutable.
> > >
> > > So my idea is to have a mapping index, with tupleid as the first column and
> > > the function's values as subsequent columns. I have a somewhat detailed
> > > design in mind. So there will be an overhead of three extra I/Os for
> > > update/delete on indices based on user-defined functions. But this setup
> > > will speed up a lot of queries where the tables are partitioned and there
> > > will be more inserts and selects and dropping of partitions at periodic
> > > intervals. Updates become costly by 3 I/Os per index with snapshot. So if
> > > someone has more selects than updates+deletes then this index might come
> > > in handy (of course not with user-defined functional indices).
>
Re: [HACKERS] First steps with 8.3 and autovacuum launcher
On Fri, 2007-10-12 at 07:17 +0100, Simon Riggs wrote:
> On Fri, 2007-10-12 at 01:24 -0400, Alvaro Herrera wrote:
> > Michael Paesold wrote:
> > > Simon Riggs wrote:
> > >
> > > Hmm, I am not sure we are there, yet. Autovacuum does take extra care to
> > > vacuum tables nearing xid wrap-around, right? It even does so when
> > > autovacuum is disabled in the configuration.
> > >
> > > So in case a vacuum is needed for that very reason, the vacuum should *not*
> > > be canceled, of course. So we don't really need the information whether
> > > the AV worker is doing VACUUM or ANALYZE, but whether it is critical
> > > against xid wrap-around. Could that be done as easily as in Alvaro's patch
> > > for distinguishing vacuum/analyze? Alvaro?
> >
> > Yes, I think it is easy to mark the "is for xid wraparound" bit in the
> > WorkerInfo struct and have the cancel work only if it's off.
> >
> > However, what I think should happen is that the signal handler for
> > SIGINT in a worker for xid wraparound should not cancel the current
> > vacuum. Instead turn it into a no-op, if possible. That way we also
> > disallow a user from cancelling vacuums for xid wraparound. I think he
> > can do that with pg_cancel_backend, and it could be dangerous.
>
> I think that is dangerous too because the user may have specifically
> turned AV off. That anti-wraparound vacuum might spring up right in a
> busy period and start working its way through many tables, all of which
> cause massive writes to occur. That's about as close to us causing an
> outage as I ever want to see. We need a way through that to allow the
> user to realise his predicament and find a good time to VACUUM. I never
> want to say to anybody "nothing you can do, just sit and watch, your
> production system will be working again in no time. Restart? No, that
> won't work either."

I think the best way to handle this is to have two limits. The first limit triggers an autovacuum that can still be cancelled. When we hit the second limit, sometime later, the autovacuum can no longer be cancelled. That would give us some breathing space if we need it.

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
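The two-limit scheme could look something like this. A minimal sketch only: the threshold constants and names below are purely illustrative assumptions, not PostgreSQL's actual values (the real limits would be derived from the XID wraparound horizon at runtime):

```c
#include <stdint.h>

typedef enum
{
    AV_NOT_NEEDED,      /* xid age below both limits */
    AV_CANCELLABLE,     /* first limit hit: autovacuum runs, user may cancel */
    AV_FORCED           /* second limit hit: autovacuum cannot be cancelled */
} AvUrgency;

/* Illustrative thresholds only; not PostgreSQL's actual values. */
#define SOFT_WRAP_LIMIT   200000000U
#define HARD_WRAP_LIMIT  1000000000U

static AvUrgency
wraparound_urgency(uint32_t xid_age)
{
    if (xid_age >= HARD_WRAP_LIMIT)
        return AV_FORCED;
    if (xid_age >= SOFT_WRAP_LIMIT)
        return AV_CANCELLABLE;
    return AV_NOT_NEEDED;
}
```

The breathing space Simon asks for is the window between the two limits, where the DBA can cancel the automatic vacuum and schedule a manual VACUUM at a quiet time.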
Re: [HACKERS] Including Snapshot Info with Indexes
Hi All,

Last mail was sent by mistake without completion. I apologize for that. I am continuing on that.

So I think we are clear now that it is possible to have an index with snapshot info. Let me try to enumerate the uses of having the index with snapshot info, in comparison to the Dead Space Map.

a) The Dead Space Map, if it is successful in its implementation of what it claims, will have the means to point out that all the tuples of certain chunks are frozen, for registered relations and registered chunks. There would be a lot of blocks which won't fall under this category.
   i) For example, if the records are newly inserted, that block can't be marked as containing all frozen tuples.
   ii) Imagine the case where there is a batch job / huge select query running in an enterprise for more than 6 hrs. All the blocks which have got inserted into the tables during this period might not be able to get the advantage of the DSM.

On 10/11/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
>
> Gokulakannan Somasundaram wrote:
> > As explained, if we are going to include the snapshot with indexes, Vacuum
> > will be done on the index independent of the table, so Vacuum will not
> > depend on immutability. We need to go to the index from the table, when we
> > want to update the snapshot info. The problem on hand is that some of the
> > user-defined functions are mutable, whereas the user might mark them
> > immutable.
> >
> > So my idea is to have a mapping index, with tupleid as the first column and
> > the function's values as subsequent columns. I have a somewhat detailed
> > design in mind. So there will be an overhead of three extra I/Os for
> > update/delete on indices based on user-defined functions. But this setup
> > will speed up a lot of queries where the tables are partitioned and there
> > will be more inserts and selects and dropping of partitions at periodic
> > intervals. Updates become costly by 3 I/Os per index with snapshot. So if
> > someone has more selects than updates+deletes then this index might come
> > in handy (of course not with user-defined functional indices).
>
> I think you need to explain why that is better than using the Dead Space
> Map. We're going to want the DSM anyway, to speed up VACUUMs; enabling
> index-only scans just came as an afterthought. While a DSM designed just
> for speeding up vacuums might look slightly different from one used for
> index-only scans, the infrastructure is roughly the same.
>
> What you're proposing sounds a lot more complex, less space-efficient,
> and slower to update. It requires extra action from the DBA, and it
> covers exactly the same use case (more selects than updates+deletes, to
> use your words). It would require changes to all index access methods,
> while the DSM would automatically work with all of them. In particular,
> including visibility information in a bitmap index, should we have
> bitmap indexes in the future, is impossible, while the DSM approach
> would just work.
>
> --
> Heikki Linnakangas
> EnterpriseDB   http://www.enterprisedb.com
>
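For what it's worth, the "more selects than updates+deletes" break-even can be put in rough numbers. This is only back-of-the-envelope arithmetic using the figures quoted in the thread (three extra index I/Os per update/delete, one heap fetch saved per index-satisfied select); the real costs depend on caching, index depth, and tuple widths:

```c
/*
 * Net I/Os saved by keeping visibility info in the index, under the
 * thread's assumed costs: each select saves one heap fetch, each
 * update/delete pays three extra index I/Os.  Positive means a win.
 */
static long
net_io_saved(long selects, long updates_plus_deletes)
{
    const long heap_fetch_saved_per_select = 1;
    const long extra_ios_per_update = 3;

    return selects * heap_fetch_saved_per_select
         - updates_plus_deletes * extra_ios_per_update;
}
```

By this crude measure the index only pays off once selects outnumber updates+deletes by more than three to one, which is consistent with the insert-mostly partitioned-table workload Gokul describes.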
Re: [HACKERS] Including Snapshot Info with Indexes
Hi All,

So I think we are clear now that it is possible to have an index with snapshot info. Let me try to enumerate the uses of having the index with snapshot info, in comparison to the Dead Space Map.

a) The Dead Space Map, if it is successful in its implementation of what it claims, will have the means to point out that all the tuples of certain chunks are frozen, for registered relations and registered chunks. There would be a lot of blocks which won't fall under this category.
   i) For example, if the records are newly inserted, that block can't be marked as containing all frozen tuples.

On 10/11/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
>
> Gokulakannan Somasundaram wrote:
> > As explained, if we are going to include the snapshot with indexes, Vacuum
> > will be done on the index independent of the table, so Vacuum will not
> > depend on immutability. We need to go to the index from the table, when we
> > want to update the snapshot info. The problem on hand is that some of the
> > user-defined functions are mutable, whereas the user might mark them
> > immutable.
> >
> > So my idea is to have a mapping index, with tupleid as the first column and
> > the function's values as subsequent columns. I have a somewhat detailed
> > design in mind. So there will be an overhead of three extra I/Os for
> > update/delete on indices based on user-defined functions. But this setup
> > will speed up a lot of queries where the tables are partitioned and there
> > will be more inserts and selects and dropping of partitions at periodic
> > intervals. Updates become costly by 3 I/Os per index with snapshot. So if
> > someone has more selects than updates+deletes then this index might come
> > in handy (of course not with user-defined functional indices).
>
> I think you need to explain why that is better than using the Dead Space
> Map. We're going to want the DSM anyway, to speed up VACUUMs; enabling
> index-only scans just came as an afterthought. While a DSM designed just
> for speeding up vacuums might look slightly different from one used for
> index-only scans, the infrastructure is roughly the same.
>
> What you're proposing sounds a lot more complex, less space-efficient,
> and slower to update. It requires extra action from the DBA, and it
> covers exactly the same use case (more selects than updates+deletes, to
> use your words). It would require changes to all index access methods,
> while the DSM would automatically work with all of them. In particular,
> including visibility information in a bitmap index, should we have
> bitmap indexes in the future, is impossible, while the DSM approach
> would just work.
>
> --
> Heikki Linnakangas
> EnterpriseDB   http://www.enterprisedb.com
>
Re: [HACKERS] First steps with 8.3 and autovacuum launcher
On Fri, 2007-10-12 at 01:24 -0400, Alvaro Herrera wrote:
> Michael Paesold wrote:
> > Simon Riggs wrote:
> >
> > Hmm, I am not sure we are there, yet. Autovacuum does take extra care to
> > vacuum tables nearing xid wrap-around, right? It even does so when
> > autovacuum is disabled in the configuration.
> >
> > So in case a vacuum is needed for that very reason, the vacuum should *not*
> > be canceled, of course. So we don't really need the information whether
> > the AV worker is doing VACUUM or ANALYZE, but whether it is critical
> > against xid wrap-around. Could that be done as easily as in Alvaro's patch
> > for distinguishing vacuum/analyze? Alvaro?
>
> Yes, I think it is easy to mark the "is for xid wraparound" bit in the
> WorkerInfo struct and have the cancel work only if it's off.
>
> However, what I think should happen is that the signal handler for
> SIGINT in a worker for xid wraparound should not cancel the current
> vacuum. Instead turn it into a no-op, if possible. That way we also
> disallow a user from cancelling vacuums for xid wraparound. I think he
> can do that with pg_cancel_backend, and it could be dangerous.

I think that is dangerous too because the user may have specifically turned AV off. That anti-wraparound vacuum might spring up right in a busy period and start working its way through many tables, all of which cause massive writes to occur. That's about as close to us causing an outage as I ever want to see. We need a way through that to allow the user to realise his predicament and find a good time to VACUUM. I never want to say to anybody "nothing you can do, just sit and watch, your production system will be working again in no time. Restart? No, that won't work either."

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] First steps with 8.3 and autovacuum launcher
Michael Paesold wrote:
> Simon Riggs wrote:
>
> Hmm, I am not sure we are there, yet. Autovacuum does take extra care to
> vacuum tables nearing xid wrap-around, right? It even does so when
> autovacuum is disabled in the configuration.
>
> So in case a vacuum is needed for that very reason, the vacuum should *not*
> be canceled, of course. So we don't really need the information whether
> the AV worker is doing VACUUM or ANALYZE, but whether it is critical
> against xid wrap-around. Could that be done as easily as in Alvaro's patch
> for distinguishing vacuum/analyze? Alvaro?

Yes, I think it is easy to mark the "is for xid wraparound" bit in the WorkerInfo struct and have the cancel work only if it's off.

However, what I think should happen is that the signal handler for SIGINT in a worker for xid wraparound should not cancel the current vacuum. Instead turn it into a no-op, if possible. That way we also disallow a user from cancelling vacuums for xid wraparound. I think he can do that with pg_cancel_backend, and it could be dangerous.

-- 
Alvaro Herrera   http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Release notes introductory text
On Thu, 11 Oct 2007 21:31:20 -0400 Tom Lane <[EMAIL PROTECTED]> wrote:
> "Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> > With respect to you Kevin, your managers should wait. You don't
>
> Now I realize that you did say "test" above, but way too often I see
> this sort of argument as a justification for doing nothing and
> expecting somebody else to fix it.

Tom, I get paid to deal with it, remember? I hardly want to do nothing and expect someone else to fix it. I wouldn't get paid :)

I just believe that the sanest course of action with my customers' data is the conservative course of action. That means testing before, during, and after release. It also means that unless there is something extremely pertinent (I have several customers threatening to fly out and strangle me personally if I don't upgrade them to 8.3 ASAP because of HOT), I won't upgrade them until I have confidence in production deployment.

There are varying degrees of need. Some will need 8.3 as soon as possible, but even those I would professionally suggest put it against a test harness of production data. Otherwise they are looking for trouble, and they may or may not find it.

When one has been doing this as long as I have, with as many PostgreSQL deployments as I have... you get gun-shy and only upgrade when you must. That means dot releases, or specific business value. I have many, many customers that will be on 8.1 for a very, very long time to come. I will concede that it is a very hard argument for *anyone* to be running less than 8.1.

Sincerely,

Joshua D. Drake

> regards, tom lane

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997   http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] Release notes introductory text
"Joshua D. Drake" <[EMAIL PROTECTED]> writes: > With respect to you Kevin, your managers should wait. You don't > install .0 releases of "any" software into production without "months" > of testing. At which point, normally a .1 release has come out anyway. How exactly do you expect the software to get from a .0 to a .1 release, or to have addressed the bugs that might bite you when it does get to .1, if you aren't helping to test it? Now I realize that you did say "test" above, but way too often I see this sort of argument as a justification for doing nothing and expecting somebody else to fix it. regards, tom lane
Re: [HACKERS] Timezone database changes
"Trevor Talbot" <[EMAIL PROTECTED]> writes: > I don't know if there have ever been retroactive changes to DST laws > we could look at, but I could easily see a change like that affecting > some things and not others. Even a politician would hardly be silly enough to make a retroactive DST law change. On the other hand, it is *entirely* possible for a computer system's DST rule files to be updated only after the effective date of a law, and I think that's the case you'd want to design for. Maybe, when you look at it that way, the past and future cases aren't so different after all ... regards, tom lane
Re: [HACKERS] Release notes introductory text
On Oct 11, 2007, at 18:51 , Joshua D. Drake wrote: With respect to you Kevin, your managers should wait. You don't install .0 releases of "any" software into production without "months" of testing. At which point, normally a .1 release has come out anyway. At the same time, an open source project such as PostgreSQL provides advantages here, in that preliminary testing can be performed during the development of the release, verified, of course, after the release has been made. Michael Glaesemann grzm seespotcode net
Re: [HACKERS] Release notes introductory text
On Thu, 11 Oct 2007 15:26:43 -0500 "Kevin Grittner" <[EMAIL PROTECTED]> wrote: > > This release represents a major leap forward by adding significant > > new functionality and performance enhancements to > > PostgreSQL. Many complex ideas that normally take > > years to implement were added rapidly to this release by our > > development team. > > You do realize that this will make many managers very reluctant to > adopt it before it has settled in for many months, right? With respect to you Kevin, your managers should wait. You don't install .0 releases of "any" software into production without "months" of testing. At which point, normally a .1 release has come out anyway. Sincerely, Joshua D. Drake > > If the goal is to provide fair warning of a higher-than-usual-risk > release, you've got it covered. > > -Kevin -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] Some questions about mammoth replication
On Thu, 11 Oct 2007 21:58:45 +0300 Hannu Krosing <[EMAIL PROTECTED]> wrote: > > We have hooks in the executor calling our own collecting functions, so > > we don't need the trigger machinery to launch replication. > > But where do you store the collected info - in your own > replication_log table, No, we have our own transaction log outside the database. > or do you reuse data in the WAL, extracting it on the > master before replication to the slave (or on the slave after moving the WAL)? > We currently make zero use of WAL except to verify that the replication data can be written. > > > Do you make use of snapshot data, to make sure what parts of the WAL > > > are worth migrating to slaves, or do you just apply > > > everything in WAL in separate transactions and abort if you find > > > out that the original transaction aborted? > > > > We check if a data transaction is recorded in WAL before sending > > it to a slave. For an aborted transaction we just discard all data > > collected from that transaction. > > Do you duplicate postgresql's MVCC code for that, or will this happen > automatically via using MVCC itself for collected data? > Gonna have to wait for Alexey for the rest. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] Timezone database changes
On 10/11/07, Tom Lane <[EMAIL PROTECTED]> wrote: > "Trevor Talbot" <[EMAIL PROTECTED]> writes: > > Neither is the birth certificate. The recorded, legal time of the > > birth is the one that was written down. If it doesn't happen to match > > an international notion of current time, that's unfortunate, but it's > > not subject to arbitrary changes later. Even if it does match, it > > still belongs to a specific time zone. That's the key semantic point: > > regurgitating that time as anything other than exactly what it was > > entered as is simply not correct. > > I'm not convinced about that. One consideration I think you are failing > to account for is that there is a big difference between past and future > times, at least in terms of what is likely to be the meaning of a > change. The above reasoning might apply to a past time but I think it's > bogus for a future time. If the TZ offset for a future time changes, > it's likely because of a DST law change, and we are in Peter's > what-time-is-the-appointment scenario. A TZ offset for a past time > probably should not change, but if it does, it suggests a retroactive > data correction. Surely you don't intend to prevent people from fixing > bad data? No, but I am mixing some different issues together. The original question of this thread is what happens when the zone rules change for an already-entered time. I contend the answer to that is a symptom of the semantics of how it's treated, which boil down to whether a value is stable relative to a specific zone, or to UTC. Other symptoms include whether it accurately transports, can be retrieved in the same form it was entered in, etc. So the birth certificate argument is for past times, unlikely to have zone rules change, but does need to be tagged with a specific time zone so that it can be returned exactly the same way. The appointment argument is for future times, more likely to have zone rules change, and still needs to be tagged with a specific time zone. 
That includes transport, which implies that it should never be exposed in any other form. Same semantics really, it's just that one problem is less likely to happen in one of those situations. If something like a birth date is found to be incorrect, it would have to be corrected through official methods, which means some human involvement. The only reasonable thing a database can do is keep it exactly the same as entered until explicitly told otherwise; changing it automatically is equivalent to corruption. If the database is using zone rules that are out of date, and the stamps are stored as local value and zone, only dynamic calculations are affected. When the zone rules are updated, not changing the data is always the correct approach. I don't know if there have ever been retroactive changes to DST laws we could look at, but I could easily see a change like that affecting some things and not others. Individual organizations make their own calls, state entities make varying decisions after gigantic reviews, etc. It would not surprise me at all to see yearly permits retroactively change, lifetime certificates stay the same because they don't want to reprint stuff, except the modern computerized department that doesn't need to reprint much of anything, etc. The correct result is subjective, but since it's still a human call, you want to default to not mangling the data. People shouldn't be prevented from fixing bad data, but I don't see how the database can possibly determine it *is* bad. It seems similar to the server's clock being off while it's inserting data with NOW; there's just nothing you can do to automatically repair that after you fix the clock.
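A small C illustration of the semantic difference being argued here, using POSIX TZ offset strings as stand-ins for real tzdata zones (an assumption for the sake of a deterministic example): a timestamp stored as a UTC instant changes its local rendering when the zone's rules change (simulated below by swapping the offset), while a stored wall-clock value plus zone name is, by construction, unaffected until someone recomputes it.

```c
#include <stdlib.h>
#include <time.h>

/*
 * Render a stored UTC instant as a local wall-clock hour under a given
 * POSIX TZ rule ("UTC5" means five hours behind UTC).  If the rule
 * changes, the rendering changes -- the "appointment moves".
 */
static int
wall_hour_of_instant(time_t instant, const char *tz_rule)
{
    struct tm local;

    setenv("TZ", tz_rule, 1);
    tzset();
    localtime_r(&instant, &local);
    return local.tm_hour;
}
```

Epoch 0 renders as 19:00 under "UTC5" but 20:00 under "UTC4": the stored instant is stable while the wall-clock reading is not. Storing (wall clock, zone name) inverts that trade-off, which is why neither representation alone answers both the birth-certificate case and the appointment case.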
Re: [HACKERS] Timezone database changes
"Trevor Talbot" <[EMAIL PROTECTED]> writes: > Neither is the birth certificate. The recorded, legal time of the > birth is the one that was written down. If it doesn't happen to match > an international notion of current time, that's unfortunate, but it's > not subject to arbitrary changes later. Even if it does match, it > still belongs to a specific time zone. That's the key semantic point: > regurgitating that time as anything other than exactly what it was > entered as is simply not correct. I'm not convinced about that. One consideration I think you are failing to account for is that there is a big difference between past and future times, at least in terms of what is likely to be the meaning of a change. The above reasoning might apply to a past time but I think it's bogus for a future time. If the TZ offset for a future time changes, it's likely because of a DST law change, and we are in Peter's what-time-is-the-appointment scenario. A TZ offset for a past time probably should not change, but if it does, it suggests a retroactive data correction. Surely you don't intend to prevent people from fixing bad data? regards, tom lane
Re: [HACKERS] Release notes introductory text
>>> On Thu, Oct 11, 2007 at 3:34 PM, in message <[EMAIL PROTECTED]>, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> The intent was to say we got a lot done in
> one year. You have a suggestion?

My suggestion would be to stay away from statements about the speed of development and focus on the user benefits of the release. Before my previous post I asked a manager to read the statement and let me know what he thought, and he said that it "sounds great if it works", but that it sounded like something was being rushed into production, which in his experience always meant a lot of bugs.

I think everyone who contributed to these major improvements deserves to be proud of their productivity, but it's hard to talk about that in public without generating the wrong impression. Come to think of it, statements about high productivity, valuable contributions, and community effort all spin OK. I would stay away from "heroic effort," though, as it is used in a negative way in some "agile programming" documents.

Again, above all, focus on answering the question: "What benefit do I get from moving to this release?"

-Kevin
Re: [HACKERS] Release notes introductory text
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Kevin Grittner wrote:
>> If the goal is to provide fair warning of a higher-than-usual-risk
>> release, you've got it covered.

> No, that was not the intent. The intent was to say we got a lot done in
> one year. You have a suggestion?

Yeah: take the entire paragraph out again. I concur with Neil that it's nothing but marketing fluff, and inaccurate fluff at that.

regards, tom lane
Re: [HACKERS] Timezone database changes
On 10/11/07, Gregory Stark <[EMAIL PROTECTED]> wrote: > "Trevor Talbot" <[EMAIL PROTECTED]> writes: > > While I agree that UTC storage is definitely a needed option, I was > > trying to point out in the scenario above that sometimes an event > > recorded at a specific moment in time *is* local time. Birth > > certificates aren't in UTC. Usually there's no practical difference, > > but there can be a semantic difference. > > Thinking of it as UTC is the wrong way to think about it. A birth occurred at > a specific moment in time. You want to record that precise moment, not what it > happened to show on the clock at the time. If the clock turns out to have been > in the wrong timezone the birth isn't going to move. Neither is the birth certificate. The recorded, legal time of the birth is the one that was written down. If it doesn't happen to match an international notion of current time, that's unfortunate, but it's not subject to arbitrary changes later. Even if it does match, it still belongs to a specific time zone. That's the key semantic point: regurgitating that time as anything other than exactly what it was entered as is simply not correct. Birth dates enter common usage with the time zone stripped. Your birthday doesn't change when you move across a date line, despite the fact that it's tied to the zone you were born in. And yet it's an observed and recorded event, not a predicted appointment.
Re: [HACKERS] Release notes introductory text
On 10/11/07, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> Kevin Grittner wrote:
> > >>> On Thu, Oct 11, 2007 at 3:04 PM, in message
> > <[EMAIL PROTECTED]>, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> >
> > > This release represents a major leap forward by adding significant new
> > > functionality and performance enhancements to
> > > PostgreSQL. Many complex ideas that normally take years
> > > to implement were added rapidly to this release by our development team.
> >
> > You do realize that this will make many managers very reluctant to adopt
> > it before it has settled in for many months, right?
> >
> > If the goal is to provide fair warning of a higher-than-usual-risk
> > release, you've got it covered.
>
> No, that was not the intent. The intent was to say we got a lot done in
> one year. You have a suggestion?

Well, a number of these were bumped from 8.2, so it might be a good idea to go with something like "complex improvements long under development have come to fruition". For the reason suggested above, I don't think it's a great idea to try to emphasize the impressive speed with which some of these features have actually been implemented. I don't know that there's any credible way to tell people that although these things were done quickly they were also done with the exceptional care and attention to detail for which the PostgreSQL development community is famous.

I really like your wording about how we're going beyond feature parity. That's exactly the kind of stance for which the "World's Most Advanced Open Source Database" ought to be aiming. As long as we can avoid the negative connotations associated with "experimental".

Andrew
Re: [HACKERS] [PATCHES] Eliminate more detoast copies for packed varlenas
I wrote:
> In fact, it seems there's a different risk here: if such a datum were
> toasted out-of-line, the reference in a cached plan might live longer
> than the underlying toast-table data. Maybe we need a forcible detoast
> in evaluate_function().

Sure enough, it seems we do. The attached script fails in every release back to 7.3. It's a rather contrived case, because a function marked immutable probably shouldn't be reading from a table at all, especially not one you are likely to change or drop. That's probably why we've not heard reports of this before. But it's definitely a bug.

regards, tom lane

create table big(f1 text);
insert into big values(repeat('xyzzy',10));
create or replace function getbig() returns text as
'select f1 from big' language sql immutable;
create or replace function usebig(text) returns bool as '
begin
  return $1 ~ ''xyzzy'';
end' language plpgsql;
prepare foo as select usebig(getbig()) from int4_tbl;
execute foo;
drop table big;
execute foo;
Re: [HACKERS] Release notes introductory text
On Thu, 2007-10-11 at 16:04 -0400, Bruce Momjian wrote: > I have added the following introductory paragraph to the release notes: > > This release represents a major leap forward by adding significant new > functionality and performance enhancements to > PostgreSQL. Many complex ideas that normally take years > to implement were added rapidly to this release by our development team. > This release starts PostgreSQL on a trajectory moving > beyond feature-parity with other database systems to a time when > PostgreSQL will be a technology leader in the database > field. Frankly, this sounds like empty hyperbole to me. There is a LOT of work left to do before we reach feature parity with the major players, let alone become a "technology leader in the database field". I would personally vote for just saying that this release brings with it a lot of useful new features and performance improvements, and leave it up to the reader to decide whether we're on a "trajectory moving beyond feature-parity". If you want to compare where we are with the major players, I think it would be more accurate to say that we're doing fairly well on OLTP oriented features, but there is a lot of work left on OLAP functionality. -Neil ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Release notes introductory text
On Thu, 11 Oct 2007 16:34:14 -0400 (EDT) Bruce Momjian <[EMAIL PROTECTED]> wrote:
> Kevin Grittner wrote:
> > > PostgreSQL. Many complex ideas that normally take years
> > > to implement were added rapidly to this release by our development team.
> >
> > You do realize that this will make many managers very reluctant to adopt
> > it before it has settled in for many months, right?
> >
> > If the goal is to provide fair warning of a higher-than-usual-risk
> > release, you've got it covered.
>
> No, that was not the intent. The intent was to say we got a lot done in
> one year. You have a suggestion?

What if you changed "were added rapidly" to "were quickly brought to maturity" or something like that?

--
D'Arcy J.M. Cain <[EMAIL PROTECTED]> | Democracy is three wolves
http://www.druid.net/darcy/          | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP)    | what's for dinner.
Re: [HACKERS] Release notes introductory text
Kevin Grittner wrote:
> >>> On Thu, Oct 11, 2007 at 3:04 PM, in message
> <[EMAIL PROTECTED]>, Bruce Momjian <[EMAIL PROTECTED]> wrote:
>
> > This release represents a major leap forward by adding significant new
> > functionality and performance enhancements to
> > PostgreSQL. Many complex ideas that normally take years
> > to implement were added rapidly to this release by our development team.
>
> You do realize that this will make many managers very reluctant to adopt
> it before it has settled in for many months, right?
>
> If the goal is to provide fair warning of a higher-than-usual-risk
> release, you've got it covered.

No, that was not the intent. The intent was to say we got a lot done in one year. You have a suggestion?

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Release notes introductory text
>>> On Thu, Oct 11, 2007 at 3:04 PM, in message <[EMAIL PROTECTED]>, Bruce Momjian <[EMAIL PROTECTED]> wrote:

> This release represents a major leap forward by adding significant new
> functionality and performance enhancements to
> PostgreSQL. Many complex ideas that normally take years
> to implement were added rapidly to this release by our development team.

You do realize that this will make many managers very reluctant to adopt it before it has settled in for many months, right?

If the goal is to provide fair warning of a higher-than-usual-risk release, you've got it covered.

-Kevin
Re: [HACKERS] First steps with 8.3 and autovacuum launcher
On Thu, 2007-10-11 at 21:59 +0200, Michael Paesold wrote: > So in case a vacuum is needed for that very reason, the vacuum should *not* > be canceled, of course. So we don't really need the information, whether > the AV worker is doing VACUUM or ANALYZE, but whether it is critical > against xid wrap-around. Could that be done as easily as in Alvaro's patch > for distinguishing vacuum/analyze? Alvaro? Well, I did think about this. We probably want to preserve the ability of an autovacuum to be manually cancelled. So the only way to do this is by letting the would-be canceller know that they shouldn't cancel that one by marking the autovacuum to show it is a "compulsory" one. We could change the field on PGPROC from a boolean isAutovacuum to a status flag, so we have bit flags for IS_AUTOVACUUM and IS_WRAPAROUND_AVOIDANCE. I think that's overkill personally, but you might argue me round. > The other thing I am wondering about is, whether it would be a safer > approach to let the DBA decide whether to cancel AV vacuums or just disable > cost-delay, as Heikki suggested. There might be valid work-loads for both > options... Cancelling the VACUUM hurts nobody, and allows the DDL to be run now, not later when the database server gets round to it. Speeding up a delayed vacuum will hurt everybody. A big VACUUM can last hours, even at full speed and that is a big beast to let loose during prime time. BTW I took the liberty of starting a new thread on this. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Release notes introductory text
[ BCC to docs and hackers. Sorry this seems like the only logical way to do this.]

I have added the following introductory paragraph to the release notes:

    This release represents a major leap forward by adding significant new
    functionality and performance enhancements to PostgreSQL. Many complex
    ideas that normally take years to implement were added rapidly to this
    release by our development team. This release starts PostgreSQL on a
    trajectory moving beyond feature-parity with other database systems to
    a time when PostgreSQL will be a technology leader in the database
    field. Major new capabilities in this release include:

The full release text with my edits to "major" and "migration" sections is included:

    http://momjian.us/main/writings/pgsql/sgml/release-8-3.html

Comments?

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [PATCHES] Eliminate more detoast copies for packed varlenas
Gregory Stark <[EMAIL PROTECTED]> writes: > For the record I've been doing some more testing and found one place that > could be a problem down the road. I'm not sure why it didn't show up > previously. In selfuncs.c we use VARDATA/VARSIZE on data that is taken from > parser Const nodes and from the histogram arrays without detoasting them. > Currently this is safe as array elements are not packed and parser nodes > contain values read using textin and never stored in a tuple. But down the > road I expect we'll want to pack array element so this code would need to > detoast the elements or prepare to handle packed elements. Hmmm ... I think this should be fixed now, actually. I'm far from convinced that a Const could never contain a toasted datum. Consider constant-folding in the planner --- it just stuffs the result of a function into a Const node. In fact, it seems there's a different risk here: if such a datum were toasted out-of-line, the reference in a cached plan might live longer than the underlying toast-table data. Maybe we need a forcible detoast in evaluate_function(). regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] First steps with 8.3 and autovacuum launcher
Simon Riggs wrote:
> After some thought, you and Michael have persuaded me that there is
> cause to do this for VACUUM as well, but just for autovacuum, I think.
> That also makes the patch simpler, since we don't need to delve inside
> the av worker to see what it is doing. Alvaro: That means we can just
> skip your patch altogether, or at least we can discuss them separately
> now.
> ...
> > The only danger I can see is that the autovacuum is always killed and
> > never gets to finish, leading to degrading performance at first and
> > shutdown to prevent xid wraparound at the extreme. Doesn't seem likely
> > under normal circumstances, though.
>
> Yeh agreed. Table locks aren't that common, so I think we are safe for
> 100s of millions of transactions. The user has log messages to warn of
> that, so I think we're good.

Hmm, I am not sure we are there, yet. Autovacuum does take extra care to vacuum tables nearing xid wrap-around, right? It even does so when autovacuum is disabled in the configuration. So in case a vacuum is needed for that very reason, the vacuum should *not* be canceled, of course. So we don't really need the information whether the AV worker is doing VACUUM or ANALYZE, but whether it is critical against xid wrap-around. Could that be done as easily as in Alvaro's patch for distinguishing vacuum/analyze? Alvaro?

The other thing I am wondering about is whether it would be a safer approach to let the DBA decide whether to cancel AV vacuums or just disable cost-delay, as Heikki suggested. There might be valid work-loads for both options...

Btw., I am grateful you took up the work here, Simon.

Best Regards
Michael Paesold
Re: [HACKERS] Some questions about mammoth replication
On Thu, 2007-10-11 at 18:25, Alexey Klyukin wrote:
> Hello,
>
> Hannu Krosing wrote:
> >
> > Here come my questions:
> >
> > From looking at http://www.commandprompt.com/images/MR_components.jpg it
> > seems that you don't do replication just from WAL logs, but also collect
> > some extra info inside the PostgreSQL server. Is this so?
> >
> > If it is, then in what way does it differ from simple trigger-based
> > change logging?
>
> We have hooks in the executor calling our own collecting functions, so we
> don't need the trigger machinery to launch replication.

But where do you store the collected info: in your own replication_log table, or do you reuse the data in WAL, extracting it on the master before replication to the slave (or on the slave after moving the WAL)?

> > Do you make use of snapshot data, to make sure what parts of the WAL log
> > are worth migrating to slaves, or do you just apply everything in WAL
> > in separate transactions and abort if you find out that the original
> > transaction aborted?
>
> We check if a data transaction is recorded in WAL before sending
> it to a slave. For an aborted transaction we just discard all data collected
> from that transaction.

Do you duplicate PostgreSQL's MVCC code for that, or will this happen automatically via using MVCC itself for the collected data?

How do you handle really large inserts/updates/deletes, which change say 10M rows in one transaction?

> > Are your slaves a) standby b) read-only or c) read-write?
>
> Replicated relations are read-only on slaves.

> > Do you extract / generate full SQL DML queries from data in WAL logs, or
> > do you apply the changes at some lower level?
>
> We replicate the binary data along with a command type. Only the data
> necessary to replay the command on a slave are replicated.

Do you replay it as SQL insert/update/delete commands, or directly on heap/indexes?

Hannu
Re: [HACKERS] Timezone database changes
"Trevor Talbot" <[EMAIL PROTECTED]> writes: >> 2) Specific moment in time >>(i.e. stored in UTC which is unaffected by time zone rules) >> >> 3) Specified time of day in specified time zone >>(equivalent to #2 except when the time zone rules change) > >> Surely #2 is a must-have. There has to be a data type for representing a >> fixed >> moment in time unaffected by any time zone rules. Anything recording events >> -- >> which of course occurred at a specific moment in time -- needs it and there >> are a whole lot of databases which do just that. Actually in my experience >> most tables have one or sometimes more timestamps of that nature. > > While I agree that UTC storage is definitely a needed option, I was > trying to point out in the scenario above that sometimes an event > recorded at a specific moment in time *is* local time. Birth > certificates aren't in UTC. Usually there's no practical difference, > but there can be a semantic difference. Thinking of it as UTC is the wrong way to think about it. A birth occurred at a specific moment in time. You want to record that precise moment, not what it happened to show on the clock at the time. If the clock turns out to have been in the wrong timezone the birth isn't going to move. The use case for storing a local timestamp with a timezone attached is for things like appointments. If the time zone rules change you would want the appointment to move with them, not to stay at the same moment in time. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Cancelling Blocking Autovacuums
After much thought and discussion, here is my proposal of how to handle autovacuum workers which block out user-initiated SQL statements.

Autovacuum workers running VACUUM, VACUUM ANALYZE and ANALYZE can give problems by blocking out other users in various circumstances. There are good cases for all 3 statements to be cancelled, so we don't bother to enquire what exactly the AV worker is actually doing before we cancel it. If an AV worker task is cancelled it will be scheduled to run again on the next cycle. It is possible, though unlikely, that this will result in indefinite starvation; in that case the existing log message will warn the user.

We check for blocking AV workers when we enter the deadlock detector. At this stage we must allow the deadlock detector to proceed to its conclusion in case any hard or soft deadlocks exist. It is possible that our lock queue waits on an AV worker *and* we are part of a deadlock with a third process, so we must just let the detector roll. If we find no deadlock, yet the head of the wait queue is an AV worker, then we send a cancel (SIGINT) to that process. It's possible for multiple backends to do this, though that shouldn't be a problem.

If an AV worker is not a current lock holder we wait normally, since it is presumed the user knows about other conflicting lock holders and would have to have waited anyway. In that case the AV worker is not cancelled. That aspect is changeable with some thought.

A patch that implements this is being sent to -patches now.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
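The decision sequence proposed above can be sketched as follows. This is an illustrative Python sketch only, not the patch itself; the names (find_deadlock, wait_queue, cancel, holds_conflicting_lock) are hypothetical stand-ins for the real lock-manager structures.

```python
def on_deadlock_timeout(find_deadlock, wait_queue, holds_conflicting_lock, cancel):
    """Sketch of the proposal: let the deadlock detector run to its
    conclusion first; only when no hard/soft deadlock exists, the head of
    our wait queue is an autovacuum worker, and it actually holds a
    conflicting lock, do we cancel (SIGINT) it."""
    if find_deadlock():
        return "deadlock"              # handled by the normal deadlock machinery
    head = wait_queue[0]
    if head["is_autovacuum"] and holds_conflicting_lock(head):
        cancel(head["pid"])            # the AV task is rescheduled next cycle
        return "cancelled-autovacuum"
    return "keep-waiting"              # AV not holding our lock: wait normally
```

Multiple backends may reach the cancel branch for the same worker; as the text notes, sending a redundant SIGINT is harmless.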
Re: [HACKERS] full text search in 8.3
andy wrote:
> Do I need to worry about sed with Windows users?

Yes. Perl is probably more common on Windows, and should be able to do everything sed can. It's also required for doing Windows/MSVC builds.

cheers

andrew
Re: [HACKERS] Timezone database changes
=?ISO-8859-1?Q?Magne_M=E6hre?= <[EMAIL PROTECTED]> writes: > Correct me if I'm wrong, but IIRC there is no universally accepted > canonical list of time zone names (labels). Yeah; we have an agreed-on list of names for the purposes of PG, namely the names shown by pg_timezone_names, but that list changes from time to time. What's worse, it might be different on some platforms than others even for the same PG release, since some distros override our zic database with the system's own. So one of the problems that would have to be surmounted to have a datatype that embeds a zone ID is to figure out a persistent, but compact, ID to assign to each zone ... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
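One conceivable shape for such a persistent, compact ID is an append-only catalog: once a zone name has been assigned an ID it keeps it forever, even as later zic updates add zones. The sketch below is purely hypothetical, not anything PostgreSQL implements; renamed zones would need alias entries pointing at the old ID rather than fresh assignments.

```python
def zone_id(name, catalog):
    """Return a small, stable integer ID for a time zone name.

    `catalog` stands in for an append-only mapping persisted with the
    cluster: known names always return the same ID across tz-database
    updates; unknown names get the next free ID."""
    if name not in catalog:
        catalog[name] = len(catalog) + 1
    return catalog[name]

catalog = {}
zone_id("UTC", catalog)               # assigns ID 1
zone_id("Europe/Vienna", catalog)     # assigns ID 2
zone_id("UTC", catalog)               # stable: still returns 1
```

The hard part Tom identifies remains the cross-platform one: two installations with different zic databases must not hand out conflicting IDs for the same name, which is why the mapping would have to live with the cluster data, not the binary.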
Re: [HACKERS] full text search in 8.3
Tom Lane wrote:
> andy <[EMAIL PROTECTED]> writes:
> > the operator = is not the 'normal =' is it? Its the 'tsearch2 =', right?
>
> That one probably is, but how is your sed script going to distinguish it
> from other user-defined '=' operators that might be in the dump?
>
> > Do I need to worry about sed with Windows users?
>
> I think sed is available but not normally installed on Windows.
> Unfortunately the same could be said of any other tool you might choose,
> so that's probably not a reason not to use it.
>
> regards, tom lane

Oh man... Ok, do you want to go as far as extracting just one operator, and pulling out its PROCEDURE name? For one of the ='s, I put just its line to the file x:

1122; 2617 98020 OPERATOR public = andy

Then ran:

[EMAIL PROTECTED]:/pub/back$ pg_restore -Fc -L x vcstimes.bak
--
-- PostgreSQL database dump
--

SET client_encoding = 'LATIN1';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;

SET search_path = public, pg_catalog;

--
-- Name: =; Type: OPERATOR; Schema: public; Owner: andy
--

CREATE OPERATOR = (
    PROCEDURE = tsquery_eq,
    LEFTARG = tsquery,
    RIGHTARG = tsquery,
    COMMUTATOR = =,
    NEGATOR = <>,
    MERGES,
    RESTRICT = eqsel,
    JOIN = eqjoinsel
);

ALTER OPERATOR public.= (tsquery, tsquery) OWNER TO andy;

--
-- PostgreSQL database dump complete
--

I could grep out the PROCEDURE line and see if it looks tsearch2'ish. If you want to go that route, it's starting to sound beyond sed, would perl be ok?

-Andy
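The PROCEDURE-based check described above is easy to script. Here is a hedged Python sketch that classifies one dumped CREATE OPERATOR block by the function named in its PROCEDURE clause; the prefix list is a guess at what counts as "tsearch2'ish" and would need tuning against the actual functions tsearch2 installs.

```python
import re

# Hypothetical prefix list; adjust to match the function names the
# tsearch2 contrib module actually creates in your dump.
TSEARCH_PROCS = ("tsvector_", "tsquery_", "ts_", "tsearch", "rank", "headline")

def is_tsearch_operator(create_operator_sql: str) -> bool:
    """True if the PROCEDURE clause of a dumped CREATE OPERATOR statement
    names a tsearch2-looking function, e.g. PROCEDURE = tsquery_eq."""
    m = re.search(r"PROCEDURE\s*=\s*([\w.]+)", create_operator_sql, re.IGNORECASE)
    return bool(m) and m.group(1).lower().startswith(TSEARCH_PROCS)
```

The driver (not shown) would run `pg_restore -L` with a single-entry list per ambiguous operator, feed the resulting SQL to this check, and drop matching TOC lines, which is exactly the two-step approach sketched in the email.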
Re: [HACKERS] full text search in 8.3
andy <[EMAIL PROTECTED]> writes:
> the operator = is not the 'normal =' is it? Its the 'tsearch2 =', right?

That one probably is, but how is your sed script going to distinguish it from other user-defined '=' operators that might be in the dump?

> Do I need to worry about sed with Windows users?

I think sed is available but not normally installed on Windows. Unfortunately the same could be said of any other tool you might choose, so that's probably not a reason not to use it.

regards, tom lane
Re: [HACKERS] full text search in 8.3
Florian G. Pflug wrote:
> I'm not really a tsearch user (just played with it a bit once). But I
> wondered if you are aware that you can prevent certain objects from
> being restored quite easily if you use pg_dump and pg_restore together
> with "custom format" (-Fc). There is some option to pg_restore that
> reads the dump, and outputs a table of contents. You can then remove
> some entries from that list, and pass the modified list to pg_restore,
> which will skip entries that do not show up on your modified list.
>
> Maybe we could document some regexp, awk script, or similar that strips
> the tsearch stuff from such a table of contents?
>
> regards, Florian Pflug

This has worked well. I have a little sed script that deletes all the tsearch2-looking stuff, then the restore works ok (except for the trigger, which is ok). However, am I safe not restoring all these operators?

1126; 2617 98028 OPERATOR public !! andy
1124; 2617 98024 OPERATOR public && andy
1112; 2617 98003 OPERATOR public < andy
1118; 2617 98017 OPERATOR public < andy
1113; 2617 98004 OPERATOR public <= andy
1119; 2617 98018 OPERATOR public <= andy
1117; 2617 98005 OPERATOR public <> andy
1123; 2617 98019 OPERATOR public <> andy
1128; 2617 98036 OPERATOR public <@ andy
1116; 2617 98006 OPERATOR public = andy
1122; 2617 98020 OPERATOR public = andy
1115; 2617 98001 OPERATOR public > andy
1121; 2617 98015 OPERATOR public > andy
1114; 2617 98002 OPERATOR public >= andy
1120; 2617 98016 OPERATOR public >= andy
1129; 2617 98039 OPERATOR public @ andy
1127; 2617 98037 OPERATOR public @> andy
; 2617 97955 OPERATOR public @@ andy
1110; 2617 97956 OPERATOR public @@ andy
1132; 2617 98055 OPERATOR public @@@ andy
1131; 2617 98056 OPERATOR public @@@ andy
1109; 2617 97941 OPERATOR public || andy
1125; 2617 98026 OPERATOR public || andy
1130; 2617 98038 OPERATOR public ~ andy

the operator = is not the 'normal =' is it? Its the 'tsearch2 =', right?

Do I need to worry about sed with Windows users?
-Andy
Re: [HACKERS] Timezone database changes
On 10/11/07, Gregory Stark <[EMAIL PROTECTED]> wrote: > "Tom Lane" <[EMAIL PROTECTED]> writes: > > > "Trevor Talbot" <[EMAIL PROTECTED]> writes: > >> On 10/11/07, Magne Mæhre <[EMAIL PROTECTED]> wrote: > >>> Trevor Talbot wrote: > That situation might sound a bit contrived, but I think the real point > is that even for some records of observed times, the local time is the > authoritative one, not UTC. > >>> > >>> ...and for that scenario you have TIMESTAMP WITHOUT TIME ZONE > > > >> But that doesn't give you DST-sensitive display for free, which is > >> tempting for application use, especially if the application is meant > >> to be suitably generic. > > > > If you are dealing only in local time, what do you need timezone for at > > all? October 29, 2006, 1:15am: PDT or PST? Even if you ignore overlap points like that, DST status is a piece of semantic information the human retrieving the data may want to know. It doesn't make much sense for an app to avoid the database's perfectly good knowledge of the local timezone to get it. > > Also note the possibility of coercing one type to the other on-the-fly > > for display, or using the AT TIME ZONE construct. Sure, but that's simply a workaround like tagging different zones yourself is. This single case isn't terribly important, it's just a non-future-appointment one where remembering the local zone makes sense. If we change it a bit so that it regularly transports data to a central office, you still want to know what time zone it belongs to. Right now, the local office's zone rules matter because you need it to convert to UTC properly. Instead, it should be the central office's zone rules that matter for temporary conversion and reporting, because you really don't want the original data changed at all. The original data is the legitimate record, not the conversion to UTC. This can all be done manually by applications today, of course. 
It would just be nice to take advantage of PostgreSQL's time zone knowledge more easily in these situations. > 2) Specific moment in time >(i.e. stored in UTC which is unaffected by time zone rules) > > 3) Specified time of day in specified time zone >(equivalent to #2 except when the time zone rules change) > Surely #2 is a must-have. There has to be a data type for representing a fixed > moment in time unaffected by any time zone rules. Anything recording events -- > which of course occurred at a specific moment in time -- needs it and there > are a whole lot of databases which do just that. Actually in my experience > most tables have one or sometimes more timestamps of that nature. While I agree that UTC storage is definitely a needed option, I was trying to point out in the scenario above that sometimes an event recorded at a specific moment in time *is* local time. Birth certificates aren't in UTC. Usually there's no practical difference, but there can be a semantic difference. > The lack of #3 doesn't seem terribly pressing given how rarely the time zone > rules change. Even with the latest shenanigans I don't think anyone's run into > any unexpected problems. The link I posted upthread was someone who ran into something unexpected. There wasn't enough detail to figure out what, exactly, just that something related to zones changed and surprised them. And no, I don't think it's urgent either; the current behavior is known and fairly easy to understand. It's just that some applications need a different set of semantics. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Artificially increase TransactionID?
Works perfectly. I did need to artificially create pg_clog segments.

Tom: Thanks for the quick response.

Bob.

On Oct 10, 2007, at 8:46 PM, Tom Lane wrote:
> "Robert A. Klahn" <[EMAIL PROTECTED]> writes:
> > I am interested in increasing the PostgreSQL TransactionID, as part of
> > testing a (yet another) replication system that I am working on.
> > (http://bruce.codehaus.org/ for the interested). I would like to test
> > what happens when the transactionID crosses 2^31 and when it wraps
> > around at 2^32-1. Other than running "select now()" 2^31 times, is
> > there a practical way to raise the transactionID by large amounts?
>
> Shut down postmaster, use pg_resetxlog to set NextXID to whatever you
> want, restart and test. You might need to artificially create pg_clog
> and pg_subtrans segments matching the chosen starting point, but other
> than that there's not a lot of memory of latest XID in the system.
> Bumping it by more than 2G-less-reasonable-safety-margin is unfair of
> course ...
>
> regards, tom lane
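The pg_clog segment that must exist for a chosen NextXID can be worked out with a little arithmetic. This sketch assumes the default 8K block size: 2 commit-status bits per transaction gives 4 transactions per byte, so 32768 per page, and 32 pages per SLRU segment, with segment files named as four hex digits.

```python
# Assumptions: default BLCKSZ of 8192 and the usual SLRU layout; verify
# against your build's clog.c/slru.c before relying on the numbers.
BLCKSZ = 8192
CLOG_XACTS_PER_BYTE = 4                               # 2 status bits per xact
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE    # 32768
SLRU_PAGES_PER_SEGMENT = 32

def clog_segment_name(xid: int) -> str:
    """Name of the pg_clog segment file covering `xid`."""
    xacts_per_segment = CLOG_XACTS_PER_PAGE * SLRU_PAGES_PER_SEGMENT  # 1048576
    return "%04X" % (xid // xacts_per_segment)

# Testing the 2^31 crossing therefore needs segment file "0800" to exist.
```

So a NextXID just below 2^31 lands in segment "07FF" and one just above in "0800", which matches the need to create segments by hand after pg_resetxlog.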
Re: [HACKERS] Some questions about mammoth replication
On Thu, 11 Oct 2007 19:10:18 +0300 Alexey Klyukin <[EMAIL PROTECTED]> wrote:
> Marko Kreen wrote:
> > On 10/11/07, Alexey Klyukin <[EMAIL PROTECTED]> wrote:
> > > Hannu Krosing wrote:
> > > > For what use cases do you think your WAL-based approach is
> > > > better than Slony/Skytools trigger-based one?
> > >
> > > A pure trigger based approach can only replicate data for the
> > > commands which fire triggers. AFAIK Slony is unable to replicate
> > > TRUNCATE command (I don't know if Skytools can). Replicator
> > > doesn't have this limitation.
> >
> > No, Skytools is same as Slony.
> >
> > Can you also replicate changes to system tables?
>
> No, we need a table to have a primary key to be able to replicate it.
> On the other hand replicating a system relation can be dangerous,
> i.e. what if you replicate the contents of pg_class without
> corresponding relations on the slave; that's why we explicitly forbid
> enabling replication for relations from the pg_catalog namespace.
>
> Regards,

We can however replicate large objects and specific system commands like GRANT/REVOKE and CREATE/DROP user. The latter being done via triggers.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997 http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] Skytools committed without hackers discussion/review
Magnus Hagander wrote: > > > > The results have nothing to do with whether the process was followed. > > > We do not ignore process violations just because the outcome was OK. > > > > Agreed. But reversing something that came out OK for no other reason > > than that the process was violated? I know you don't, but some people > > are asking for exactly that. > > So as long as something is committed, and only breaks certain (for now > unnamed) platforms (until fixed that is), then procedure doesn't apply. No, if someone does this again they are going to be raked over the coals like Jan was. That is probably enough to keep people following procedure. -- Bruce Momjian <[EMAIL PROTECTED]>http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] full text search in 8.3
On 10/11/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Nikolay Samokhvalov" <[EMAIL PROTECTED]> writes:
> > On 10/11/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> >> Segfaults? That shouldn't happen. Please show a test case.
>
> > Test case: use old tsearch2.so to register all tsearch2 functions to
> > "tsearch2" schema (old fashioned way). Then try:
>
> How did you get 8.3 to load the old .so at all? It should have the
> wrong PG_MODULE_MAGIC values.

Ooops, my mistake, sorry. It was the CVS version of contrib/tsearch2, so users shouldn't run into this problem.

--
Best regards,
Nikolay
Re: [HACKERS] Some questions about mammoth replication
Marko Kreen wrote:
> On 10/11/07, Alexey Klyukin <[EMAIL PROTECTED]> wrote:
> > Hannu Krosing wrote:
> > > For what use cases do you think your WAL-based approach is better than
> > > Slony/Skytools trigger-based one?
> >
> > A pure trigger based approach can only replicate data for the commands
> > which fire triggers. AFAIK Slony is unable to replicate TRUNCATE
> > command (I don't know if Skytools can). Replicator doesn't have this
> > limitation.
>
> No, Skytools is same as Slony.
>
> Can you also replicate changes to system tables?

No, we need a table to have a primary key to be able to replicate it. On the other hand, replicating a system relation can be dangerous, i.e. what if you replicate the contents of pg_class without corresponding relations on the slave? That's why we explicitly forbid enabling replication for relations from the pg_catalog namespace.

Regards,
--
Alexey Klyukin http://www.commandprompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] full text search in 8.3
On 10/11/07, Tom Lane <[EMAIL PROTECTED]> wrote: > "Nikolay Samokhvalov" <[EMAIL PROTECTED]> writes: > > During restoration to 8.3 I've catched segfaults -- during INSERTs > > into tables with "tsearch2"."tsvector" columns. > > Segfaults? That shouldn't happen. Please show a test case. Test case: use old tsearch2.so to register all tsearch2 functions to "tsearch2" schema (old fashioned way). Then try: nikolay=# select '111'::tsearch2.tsvector; ERROR: invalid memory alloc request size 1878491112 nikolay=# select '111'::tsearch2.tsvector; server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !> Backtrace: Program received signal SIGSEGV, Segmentation fault. 0x082c2d91 in gettoken_tsvector (state=0xbfae77cc, strval=0x100, lenval=0x0, pos_ptr=0xbfae7a1c, poslen=0x0, endptr=0x0) at tsvector_parser.c:209 209 RETURN_TOKEN; (gdb) bt #0 0x082c2d91 in gettoken_tsvector (state=0xbfae77cc, strval=0x100, lenval=0x0, pos_ptr=0xbfae7a1c, poslen=0x0, endptr=0x0) at tsvector_parser.c:209 #1 0xa730d85f in tsvector_in () from /test/lib/postgresql/tsearch2.so #2 0x082dda5f in InputFunctionCall (flinfo=0x0, str=0x8479c00 "111", typioparam=61357, typmod=-1) at fmgr.c:1835 #3 0x082dfe26 in OidInputFunctionCall (functionId=61358, str=0x8479c00 "111", typioparam=61357, typmod=-1) at fmgr.c:1939 #4 0x081317a7 in stringTypeDatum (tp=0xa732eef8, string=0x8479c00 "111", atttypmod=-1) at parse_type.c:462 #5 0x081343bd in coerce_type (pstate=0x8479bb4, node=0x8479ec0, inputTypeId=705, targetTypeId=61357, targetTypeMod=-1, ccontext=COERCION_EXPLICIT, cformat=COERCE_EXPLICIT_CAST) at parse_coerce.c:210 #6 0x08134b8c in coerce_to_target_type (pstate=0x8479bb4, expr=0x8479ec0, exprtype=705, targettype=61357, targettypmod=-1, ccontext=COERCION_EXPLICIT, cformat=COERCE_EXPLICIT_CAST) at parse_coerce.c:81 #7 0x081279d3 in typecast_expression 
(pstate=0x8479bb4, expr=0x8479ec0, typename=0x8479cd4) at parse_expr.c:2221 #8 0x0812872b in transformExpr (pstate=0x8479bb4, expr=0x8479d00) at parse_expr.c:150 #9 0x081369fc in transformTargetEntry (pstate=0x8479bb4, node=0x8479d00, expr=0x0, colname=0x0, resjunk=0 '\0') at parse_target.c:74 #10 0x08136ed4 in transformTargetList (pstate=0x8479bb4, targetlist=0x8479d5c) at parse_target.c:146 #11 0x0810f188 in transformStmt (pstate=0x8479bb4, parseTree=0x8479d78) at analyze.c:695 #12 0x0811103f in parse_analyze (parseTree=0x8479d78, sourceText=0x847939c "select '111'::tsearch2.tsvector;", paramTypes=0x0, numParams=0) at analyze.c:96 #13 0x0822e00e in pg_analyze_and_rewrite (parsetree=0x8479d78, query_string=0x847939c "select '111'::tsearch2.tsvector;", paramTypes=0x0, numParams=0) at postgres.c:596 #14 0x0822e1b9 in exec_simple_query (query_string=0x847939c "select '111'::tsearch2.tsvector;") at postgres.c:899 #15 0x0822fde6 in PostgresMain (argc=4, argv=, username=0x841f508 "nikolay") at postgres.c:3529 #16 0x081ff938 in ServerLoop () at postmaster.c:3181 #17 0x08200656 in PostmasterMain (argc=3, argv=0x841b878) at postmaster.c:1027 #18 0x081b34c0 in main (argc=3, argv=0xbfae7a4c) at main.c:188 I don't think that it's abnormal, because segfault was caused by old .so. Nothing wrong, right? But what we should worry about is the fact that some users will follow the same way I did and will have this segfault too... Maybe we should notice that one should remove old tsearch.so during upgrade process (the case when one runs 8.2 and 8.3 on the same machine). -- Best regards, Nikolay ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] full text search in 8.3
"Nikolay Samokhvalov" <[EMAIL PROTECTED]> writes: > On 10/11/07, Tom Lane <[EMAIL PROTECTED]> wrote: >> Segfaults? That shouldn't happen. Please show a test case. > Test case: use old tsearch2.so to register all tsearch2 functions to > "tsearch2" schema (old fashioned way). Then try: How did you get 8.3 to load the old .so at all? It should have the wrong PG_MODULE_MAGIC values. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Some questions about mammoth replication
On 10/11/07, Alexey Klyukin <[EMAIL PROTECTED]> wrote: > Hannu Krosing wrote: > > For what use cases do you think your WAL-based approach is better than > > Slony/Skytools trigger-based one ? > > A pure trigger based approach can only replicate data for the commands > which fire triggers. AFAIK Slony is unable to replicate TRUNCATE > command (I don't know if Skytools can). Replicator doesn't have this > limitation. No, Skytools is same as Slony. Can you also replicate changes to system tables? -- marko ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Some questions about mammoth replication
Alexey Klyukin wrote: > > >> For what use cases do you think your WAL-based approach is better than >> Slony/Skytools trigger-based one ? >> > > A pure trigger based approach can only replicate data for the commands > which fire triggers. AFAIK Slony is unable to replicate TRUNCATE > command It could be wrapped with ddl_script which obviously isn't transparent to the application, but I guess a table that's truncated regularly won't be a typical candidate for (async) replication either. Regards, Andreas ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Some questions about mammoth replication
Hello, Hannu Krosing wrote: > > Here come my questions : > > >From looking at http://www.commandprompt.com/images/MR_components.jpg it > seems that you don't do replication just from WAL logs, but also collect > some extra info inside postgreSQL server. Is this so ? > > If it is, then in what way does it differ from simple trigger-based > change logging ? We have hooks in executor calling our own collecting functions, so we don't need the trigger machinery to launch replication. > Do you make use of snapshot data, to make sure, what parts of WAL log > are worth migrating to slaves , or do you just apply everything in WAL > in separate transactions and abort if you find out that original > transaction aborted ? We check if a data transaction is recorded in WAL before sending it to a slave. For an aborted transaction we just discard all data collected from that transaction. > > Are your slaves a) standby b) read-only or c) read-write ? Replicated relations are read-only on slaves. > > Do you extract / generate full sql DML queries from data in WAL logs, or > do you apply the changes at some lower level ? We replicate the binary data along with a command type. Only the data necessary to replay the command on a slave are replicated. > > For what use cases do you think your WAL-based approach is better than > Slony/Skytools trigger-based one ? A pure trigger based approach can only replicate data for the commands which fire triggers. AFAIK Slony is unable to replicate TRUNCATE command (I don't know if Skytools can). Replicator doesn't have this limitation. Regards, -- Alexey Klyukin http://www.commandprompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Plan invalidation vs temp sequences
I wrote: > Well, we *have* the sequence's Oid in the regclass constant, the problem > is the difficulty of digging through the plan tree to find it. I did > consider having the planner extract it and save it aside somewhere, but > there doesn't seem to be any very convenient place to do that, short of > an extra traversal of the query tree, which is pretty annoying/expensive > for data that will probably never be needed for most queries. Actually ... now that I've consumed a bit more caffeine, it seems this could be done relatively cheaply in setrefs.c. We could add a list-of-relation-OIDs to PlannedStmt, and charge setrefs.c with creating the list, and simplify plancache.c to just use list_member_oid() instead of groveling over the rangetable for itself. I'll go try that out. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
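The mechanism sketched in that mail — have setrefs.c record the relation OIDs a plan depends on, so the plan cache can invalidate with a cheap membership test instead of re-walking the range table — can be illustrated with a small model. This is plain Python, not PostgreSQL code; the names `CachedPlan` and `invalidate_plans` are invented for illustration:

```python
# Toy model of the proposal: the planner saves the list of relation OIDs
# a plan touches (built once, in setrefs.c in the real code), and the
# plan cache invalidates by a simple membership test (cf. list_member_oid)
# instead of groveling over the rangetable on every invalidation message.

class CachedPlan:
    def __init__(self, query, relation_oids):
        self.query = query
        self.relation_oids = set(relation_oids)  # recorded at plan time
        self.valid = True

def invalidate_plans(cache, changed_oid):
    """Mark every cached plan that depends on changed_oid as stale."""
    for plan in cache:
        if changed_oid in plan.relation_oids:
            plan.valid = False

cache = [CachedPlan("select nextval('seq1')", [16384]),
         CachedPlan("select * from t", [16390])]
invalidate_plans(cache, 16384)  # e.g. temp sequence seq1 dropped/recreated
```

The point of the one-time list is that the common case (no invalidation hits) costs only a set lookup per cached plan, rather than a plan-tree traversal.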
Re: [HACKERS] full text search in 8.3
"Nikolay Samokhvalov" <[EMAIL PROTECTED]> writes: > During restoration to 8.3 I've catched segfaults -- during INSERTs > into tables with "tsearch2"."tsvector" columns. Segfaults? That shouldn't happen. Please show a test case. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] Patch: txid in core
Just in case an initdb is required in beta2, here is a patch that adds txid into core. Otherwise, please register this as a submission to 8.4. I'd like to avoid any process-related discussions in the future... It is synced with the latest patch I sent to -patches. The docs are minimal, but I think the /contrib doc should be fleshed out more first. -- marko txid-core-v1.diff.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] random dataset generator for SKYLINE operator
We wrote a little contrib module, which we'd like to share. It can be used to generate random datasets as they have been used in [Borzsonyi2001] and related work. The code is based directly on the code of the authors; thanks to Donald Kossmann for sharing it. Donald Kossmann agrees to sharing our implementation with the community. We use it primarily for profiling and regression testing of our implementation of the SKYLINE operator in PostgreSQL, which is work in progress; details will be announced soon. We use it in two ways: (1) create table a2d1000 as (select * from pg_rand_dataset('anti', 2, 1000, 0) ds(id int, d1 float, d2 float)), and then work with this table, e.g. add an additional index and query the data; or (2) directly for profiling and regression testing, as: select * from pg_rand_dataset('indep', 3, 1000, 0) ds(id int, d1 float, d2 float, d3 float) skyline by d1 min, d2 min, d3 max; It might be useful for other purposes. We ask you for comments and suggestions. What's the best way to share this module?
Declaration
-----------
CREATE FUNCTION pg_catalog.pg_rand_dataset(disttype text, dim int, rows int, seed int) RETURNS setof record AS '$libdir/randdataset', 'pg_rand_dataset' LANGUAGE C IMMUTABLE STRICT;

Usage example
-------------
Hannes=# select * from pg_rand_dataset('indep',1,5,0) ds(id int, d1 float);
 id |         d1
----+--------------------
  1 |  0.170828036112165
  2 |  0.749901980510867
  3 | 0.0963716553972902
  4 |  0.870465227342427
  5 |  0.577303506702792
(5 rows)

Hannes=# select * from pg_rand_dataset('corr',2,3,0) ds(id int, d1 float, d2 float);
 id |        d1         |        d2
----+-------------------+-------------------
  1 | 0.489722997173791 | 0.431007019449241
  2 | 0.385624871971487 | 0.645283734523713
  3 | 0.754378154805565 |  0.82440492311745
(3 rows)

Hannes=# select * from pg_rand_dataset('anti',3,2,0) ds(id int, d1 float, d2 float, d3 float);
 id |        d1         |         d2         |        d3
----+-------------------+--------------------+-------------------
  1 | 0.848072555389202 | 0.0868831583602555 | 0.656344708211334
  2 | 0.447278565604571 |  0.641176215525044 | 0.523196923510816
(2 rows)

References
----------
[Borzsonyi2001] Börzsönyi, S.; Kossmann, D. & Stocker, K.: The Skyline Operator, ICDE 2001, 421--432

Best, -Hannes randdataset.tar.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
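[Editor's note] The intent of the three distributions can be sketched in a few lines of Python. This is not the contrib module (which ports the original authors' C code); the generation recipes below are rough approximations of the independent, correlated, and anti-correlated datasets described in the skyline paper:

```python
import random

def rand_dataset(disttype, dim, rows, seed):
    """Rough sketch of the indep/corr/anti benchmark distributions.
    Yields (id, d1, ..., d_dim) tuples with values in [0, 1]."""
    rng = random.Random(seed)
    for i in range(1, rows + 1):
        if disttype == 'indep':
            # each dimension independently uniform in [0, 1]
            d = [rng.random() for _ in range(dim)]
        elif disttype == 'corr':
            # dimensions cluster around a common value -> correlated
            v = rng.random()
            d = [min(1.0, max(0.0, v + rng.gauss(0.0, 0.1)))
                 for _ in range(dim)]
        elif disttype == 'anti':
            # dimensions roughly sum to a constant -> anti-correlated,
            # which maximizes the size of the skyline
            total = max(0.1, rng.gauss(dim / 2.0, 0.1))
            w = [rng.random() for _ in range(dim)]
            s = sum(w)
            d = [min(1.0, wi * total / s) for wi in w]
        else:
            raise ValueError("unknown disttype: %s" % disttype)
        yield (i,) + tuple(d)

rows = list(rand_dataset('indep', 2, 5, 0))
```

The anti-correlated case is the interesting one for skyline benchmarks: because no point is good in every dimension, far more points survive the dominance test.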
Re: [HACKERS] full text search in 8.3
I am working on a binary-compatible library for tsearch2, which should be usable for all users who have the default configuration of tsearch2. I hope to send the patch before morning. Pavel ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] full text search in 8.3
On 10/11/07, Florian G. Pflug <[EMAIL PROTECTED]> wrote: > > Maybe we could document some regexp, awk script, or similar that strips the > tsearch stuff from such a table of contents? Just my .02c for those who will work on the migration manual. In my case, all tsearch2 stuff was kept (before 8.3) in a separate schema, namely "tsearch2". So, in 8.2, I had tsearch2.tsvector and tsearch2.tsquery data types and so on. During restoration to 8.3 I caught segfaults during INSERTs into tables with "tsearch2"."tsvector" columns. What helped me was the following procedure: 1. restore the schema only; 2. restore the data, replacing the "tsearch2"."tsvector" datatype with "tsvector": sed -e 's/tsearch2\.tsvector/tsvector/g' DATADUMP | psql DBNAME 2>restoration_errors.log 3. drop the "tsearch2" schema since it isn't needed anymore. After that, everything works normally. My case is specific since I use a separate schema for every single contrib module. -- Best regards, Nikolay ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] full text search in 8.3
On Thu, 11 Oct 2007, andy wrote: Oleg Bartunov wrote: Andy, seems you're a right person for writing migration guide. Oleg On Wed, 10 Oct 2007, andy wrote: Where would be an easy place to find all the renamed functions? My incomplete list: http://www.sai.msu.su/~megera/wiki/Tsearch2_83_changes psql \h shows syntax of text search commands \h create text search \h alter text search \h drop text search My experience with fts is limited to one project, and I just used all the default dictionaries, so I've never even played with any of that. That being said, I'd be happy to write up what I can (and I can try to put together a list of the renamed functions). Is there any chance there is an easier way to backup/restore? On one hand, its not too bad, and it'll only be once (correct?). Now that fts is in core future backup/restores will work, right? I think it's analogous to telling someone they are updating from tsearch2 to tsearch3, and it might be a little more painful than just a backup/restore. On the other hand I think a backup/restore will pollute the new db with a bunch of functions and types that wont ever be used, so it's so much cleaner to build it by hand. Are there other fts users that might have opinions on that? -Andy ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Plan invalidation vs temp sequences
"Florian G. Pflug" <[EMAIL PROTECTED]> writes: > Gregory Stark wrote: >> Given that sequences are in fact relations is there some way to work around >> the issue at least in this case by stuffing the sequence's relid someplace >> which the plan invalldation code can check for it? Well, we *have* the sequence's Oid in the regclass constant, the problem is the difficulty of digging through the plan tree to find it. I did consider having the planner extract it and save it aside somewhere, but there doesn't seem to be any very convenient place to do that, short of an extra traversal of the query tree, which is pretty annoying/expensive for data that will probably never be needed for most queries. > Hm... couldn't this be worked around by doing > create or replace function dynamic_oid(text) returning regclass as > 'select $1::regclass' language 'pl/pgsql' stable; > And then writing > nextval(dynamic_oid('mysequence')). The cast-to-text workaround that I suggested does exactly that. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] full text search in 8.3
andy wrote: Florian G. Pflug wrote: Maybe we could document some regexp, awk script, or similar that strips the tsearch stuff from such a table of contents? Ahh, I did not know that... I'll try that out and see if I can come up with something. Thanks! If you hack the old tsearch2.sql install script you can change the schema it installs to. That should make it easier to identify everything it installs. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Timezone database changes
"Tom Lane" <[EMAIL PROTECTED]> writes: > "Trevor Talbot" <[EMAIL PROTECTED]> writes: >> On 10/11/07, Magne M=E6hre <[EMAIL PROTECTED]> wrote: >>> Trevor Talbot wrote: That situation might sound a bit contrived, but I think the real point is that even for some records of observed times, the local time is the authoritative one, not UTC. >>> >>> ...and for that scenario you have TIMESTAMP WITHOUT TIME ZONE > >> But that doesn't give you DST-sensitive display for free, which is >> tempting for application use, especially if the application is meant >> to be suitably generic. > > If you are dealing only in local time, what do you need timezone for at > all? > > Also note the possibility of coercing one type to the other on-the-fly > for display, or using the AT TIME ZONE construct. I think there are clearly use cases for all three semantics: 1) Specified time of day in whatever the current time zone is (i.e. our current TIMESTAMP WITHOUT TIME ZONE) 2) Specific moment in time (i.e. stored in UTC which is unaffected by time zone rules) 3) Specified time of day in specified time zone (equivalent to #2 except when the time zone rules change) In the SQL spec #2 and #3 are interchangeable since the time zone rules there can never change. But in the real world as we've seen they do. Surely #2 is a must-have. There has to be a data type for representing a fixed moment in time unaffected by any time zone rules. Anything recording events -- which of course occurred at a specific moment in time -- needs it and there are a whole lot of databases which do just that. Actually in my experience most tables have one or sometimes more timestamps of that nature. The lack of #3 doesn't seem terribly pressing given how rarely the time zone rules change. Even with the latest shenanigans I don't think anyone's run into any unexpected problems. I would say if someone implemented #3 then it would make sense to have it. 
It would probably make sense for calendaring applications where the user is manually entering a timezone and probably means that time in that timezone even if the moment in time that it represents changes due to the rules changing. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
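[Editor's note] The three semantics above can be sketched in Python's `zoneinfo` terms. This is illustration only — the dates and zone names are chosen for the example, and the behavior under a rule change is the point, not the specific values:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# 1) Time of day in "whatever the current zone is": a naive value,
#    like TIMESTAMP WITHOUT TIME ZONE.
wall_only = datetime(2007, 12, 3, 15, 0)

# 2) A fixed moment in time: stored as a UTC instant, immune to any
#    later change in zone rules.
instant = datetime(2007, 12, 3, 15, 0, tzinfo=timezone.utc)

# 3) Time of day in a *named* zone: store the wall-clock value plus the
#    zone name, and derive the instant only when needed, so a later rule
#    change "reinterprets" the stored value.
appt_wall, appt_zone = datetime(2007, 12, 3, 15, 0), "Europe/London"
appt_instant = appt_wall.replace(tzinfo=ZoneInfo(appt_zone))

# Display the London 15:00 appointment in Auckland (London is on GMT in
# December; Auckland is on NZDT, UTC+13):
in_nz = appt_instant.astimezone(ZoneInfo("Pacific/Auckland"))
```

Under semantics #3, if Europe/London's rules later changed, the derived instant (and hence the Auckland display) would move while the stored 15:00 London wall-clock time stayed put; under #2 the instant stays put and the London wall-clock reading is what drifts.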
Re: [HACKERS] full text search in 8.3
Florian G. Pflug wrote: andy wrote: Is there any chance there is an easier way to backup/restore? On one hand, its not too bad, and it'll only be once (correct?). Now that fts is in core future backup/restores will work, right? I think it's analogous to telling someone they are updating from tsearch2 to tsearch3, and it might be a little more painful than just a backup/restore. On the other hand I think a backup/restore will pollute the new db with a bunch of functions and types that wont ever be used, so it's so much cleaner to build it by hand. Are there other fts users that might have opinions on that? I'm not really a tsearch user (just played with it a bit once). But I wondered if you are aware that you can prevent certain objects from being restored quite easiy if you use pg_dump and pg_restore together with "custom format" (-Fc). There is some option to pg_restore that reads the dump, and ouputs a table of contents. You can then remove some entries from that list, and pass the modified list to pg_restore which will skip entries that do not show up on your modified list. Maybe we could document some regexp, awk script, or similar that strips the tsearch stuff from such a table of contents? regards, Florian Pflug Ahh, I did not know that... I'll try that out and see if I can come up with something. Thanks! -Andy ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] full text search in 8.3
andy wrote: Is there any chance there is an easier way to backup/restore? On one hand, its not too bad, and it'll only be once (correct?). Now that fts is in core future backup/restores will work, right? I think it's analogous to telling someone they are updating from tsearch2 to tsearch3, and it might be a little more painful than just a backup/restore. On the other hand I think a backup/restore will pollute the new db with a bunch of functions and types that wont ever be used, so it's so much cleaner to build it by hand. Are there other fts users that might have opinions on that? I'm not really a tsearch user (just played with it a bit once). But I wondered if you are aware that you can prevent certain objects from being restored quite easily if you use pg_dump and pg_restore together with "custom format" (-Fc). There is an option to pg_restore that reads the dump and outputs a table of contents. You can then remove some entries from that list, and pass the modified list to pg_restore, which will skip entries that do not show up on your modified list. Maybe we could document some regexp, awk script, or similar that strips the tsearch stuff from such a table of contents? regards, Florian Pflug ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
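[Editor's note] The workflow Florian describes is: dump with `pg_dump -Fc`, list the archive with `pg_restore -l dump.fc > toc.txt`, edit the list, then restore with `pg_restore -L toc.txt dump.fc`. The filtering step he suggests documenting could look like the sketch below — the entry patterns are a guess at what an old tsearch2 install leaves in the TOC, so check them against your actual dump before relying on this:

```python
import re

# Assumed pattern: old tsearch2 objects typically mention one of these
# names somewhere in their pg_restore TOC entry.
TSEARCH = re.compile(r'(tsearch2|tsvector|tsquery|gtsvector)', re.IGNORECASE)

def strip_tsearch(toc_lines):
    """Keep TOC comments (lines starting with ';') and any entry that
    does not look like a tsearch2 object."""
    return [ln for ln in toc_lines
            if ln.lstrip().startswith(';') or not TSEARCH.search(ln)]

# Hypothetical TOC fragment for illustration:
toc = [
    "; Archive created at 2007-10-11",
    "2; 1255 16385 FUNCTION public tsvector_in(cstring) postgres",
    "3; 1259 16390 TABLE public documents postgres",
]
kept = strip_tsearch(toc)
```

pg_restore silently skips any archive entry missing from the `-L` list, which is what makes this approach safe to iterate on: re-run with a differently filtered list until the restore is clean.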
Re: [HACKERS] First steps with 8.3 and autovacuum launcher
"Gregory Stark" <[EMAIL PROTECTED]> writes: > "Alvaro Herrera" <[EMAIL PROTECTED]> writes: > >> Hmm, it looks like the race condition Heikki mentioned is the culprit. >> We need a way to stop future analyzes from starting. Back to the >> drawing board ... > > A crazy idea I just had -- what if you roll this into the deadlock check? So > after waiting on the lock for 1s it wakes up, finds that the holder it's > waiting on is an autovacuum process and cancels it instead of finding no > deadlock. Having given this a bit of thought I think it can be made quite general: I would assign every process a priority which could be just an unsigned integer from 0..255 stored in PGPROC. By default each process gets 128 except autovacuum which sets its priority to 64 and for safety I would set bgwriter and walwriter to 255, though afaik they never acquire any user-level locks.. When DeadLockCheck traverses the waiters it's normally looking for waiters which are blocked waiting on a lock it holds itself already. I would add that it should also compare the priority. If any waiter ahead of itself would block this process but has a lower priority then that should be corrected. If the process already has acquired the lock then it should be delivered a SIGINT. I'm not sure if that works on Windows, if not then we would have to introduce a flag in PGPROC which would also be checked on CHECK_FOR_INTERRUPTS. If not then ideally I would think it should try to just push the low priority waiter ahead of it to the back of the queue. But I'm not sure how easy that would be. Specifically I'm not sure if we would still reliably detect real deadlocks. It seems safest to just deliver a SIGINT in that case as well, since in the case of autovacuum it won't matter. This also allows for possibly having a guc which allows users to lower their own priority. In which case their queries would be automatically killed if they held up someone else at a normal priority. 
Having a flag in PGPROC instead of delivering SIGINT is attractive for other reasons aside from portability. It would open the door to having vacuums (perhaps just interactive vacuums) continue but reduce or zero their vacuum_cost_delay for example. Or perhaps they could release their analyze locks giving up on analyzing the table but continue with the vacuuming. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
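[Editor's note] The proposed rule can be reduced to a thumbnail model. This is a toy in plain Python — none of these names exist in PostgreSQL, and it deliberately ignores the lock queue, the wait-for graph, and real deadlock detection:

```python
# Toy model of the priority idea: each process carries a 0..255 priority;
# when the deadlock check fires, a waiter may cancel any lock holder of
# strictly lower priority instead of reporting a deadlock or waiting on.

PRIO_AUTOVACUUM, PRIO_DEFAULT, PRIO_CRITICAL = 64, 128, 255

class Proc:
    def __init__(self, name, priority=PRIO_DEFAULT):
        self.name, self.priority = name, priority
        self.cancelled = False          # stands in for delivering SIGINT

def deadlock_check(waiter, holders):
    """Cancel every holder blocking `waiter` that has lower priority;
    return True if the waiter can then proceed."""
    blocked = False
    for h in holders:
        if h.priority < waiter.priority:
            h.cancelled = True          # e.g. kill the autovacuum worker
        elif not h.cancelled:
            blocked = True              # an equal/higher-priority holder remains
    return not blocked

av = Proc("autovacuum", PRIO_AUTOVACUUM)
backend = Proc("user backend")          # default priority 128
can_proceed = deadlock_check(backend, [av])
```

The flag-instead-of-SIGINT variant discussed above would correspond to setting `cancelled` and letting the low-priority process notice it at its next interrupt check, which is what opens the door to softer reactions than outright cancellation.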
Re: [HACKERS] Timezone database changes
"Trevor Talbot" <[EMAIL PROTECTED]> writes: > On 10/11/07, Magne M=E6hre <[EMAIL PROTECTED]> wrote: >> Trevor Talbot wrote: >>> That situation might sound a bit contrived, but I think the real point >>> is that even for some records of observed times, the local time is the >>> authoritative one, not UTC. >> >> ...and for that scenario you have TIMESTAMP WITHOUT TIME ZONE > But that doesn't give you DST-sensitive display for free, which is > tempting for application use, especially if the application is meant > to be suitably generic. If you are dealing only in local time, what do you need timezone for at all? Also note the possibility of coercing one type to the other on-the-fly for display, or using the AT TIME ZONE construct. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] Eliminate more detoast copies for packed varlenas
"Gregory Stark" <[EMAIL PROTECTED]> writes: > "Tom Lane" <[EMAIL PROTECTED]> writes: > >> (It might be interesting to make textin produce a packed result when >> possible, just to see what breaks; but I would be afraid to try to do >> that for production...) > > Reassuringly all checks pass with a hack like that in place. (attached) For the record I've been doing some more testing and found one place that could be a problem down the road. I'm not sure why it didn't show up previously. In selfuncs.c we use VARDATA/VARSIZE on data that is taken from parser Const nodes and from the histogram arrays without detoasting them. Currently this is safe as array elements are not packed and parser nodes contain values read using textin and never stored in a tuple. But down the road I expect we'll want to pack array element so this code would need to detoast the elements or prepare to handle packed elements. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] full text search in 8.3
Oleg Bartunov wrote: Andy, seems you're a right person for writing migration guide. Oleg On Wed, 10 Oct 2007, andy wrote: Where would be an easy place to find all the renamed functions? My experience with fts is limited to one project, and I just used all the default dictionaries, so I've never even played with any of that. That being said, I'd be happy to write up what I can (and I can try to put together a list of the renamed functions). Is there any chance there is an easier way to backup/restore? On one hand, its not too bad, and it'll only be once (correct?). Now that fts is in core future backup/restores will work, right? I think it's analogous to telling someone they are updating from tsearch2 to tsearch3, and it might be a little more painful than just a backup/restore. On the other hand I think a backup/restore will pollute the new db with a bunch of functions and types that wont ever be used, so it's so much cleaner to build it by hand. Are there other fts users that might have opinions on that? -Andy ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Timezone database changes
On 10/11/07, Magne Mæhre <[EMAIL PROTECTED]> wrote: > Trevor Talbot wrote: > > Thinking that it might have had out of date zone rules brings up an > > interesting scenario though. Consider a closed (no networking or > > global interest) filing system in a local organization's office, where > > it's used to record the minutes of meetings and such via human input. > > It would seem that the correct time to record in that case is in fact > > the local time, not UTC. If that system is left alone for years, and > > does not receive any zone rule updates, it will likely begin storing > > the wrong UTC values. When the data is later transported out > > (upgrade, archive, whatever), it will be incorrect unless you use that > > particular snapshot of the zone rules. > > > > That situation might sound a bit contrived, but I think the real point > > is that even for some records of observed times, the local time is the > > authoritative one, not UTC. > > ...and for that scenario you have TIMESTAMP WITHOUT TIME ZONE But that doesn't give you DST-sensitive display for free, which is tempting for application use, especially if the application is meant to be suitably generic. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Timezone database changes
Tom Lane wrote: As an example, timestamptz '2007-01-01 00:00 -05' + interval '6 months' must yield 2007-07-01 00:00 -05 according to the spec, AFAICS; but most people living in the EST5EDT zone would prefer to get midnight -04. There are probably some folk in South America who'd prefer midnight -06. (Looks at a map ... hm, maybe not, but certainly Europe vs Africa would produce some such examples.) Correct me if I'm wrong, but IIRC there is no universally accepted canonical list of time zone names (labels). By using the name, instead of the offset, you are not guaranteed that one database client can even understand the timestamp entered by another client (unless the database contains its own definition which is backed up and restored together with the data). I clearly understand the problems (having written calendar applications myself) of not relating the time to the specific time zone, but I think not adhering to the standard is much worse. If you're only interested in one time zone, simply use WITHOUT TIME ZONE, and you're good. If you're working in multiple time zones, the arithmetic and handling will be complex (and maybe not even deterministic). I think the resolution and presentation of local time is best handled at the application level, and not by the database. --Magne ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Timezone database changes
Trevor Talbot wrote: Thinking that it might have had out of date zone rules brings up an interesting scenario though. Consider a closed (no networking or global interest) filing system in a local organization's office, where it's used to record the minutes of meetings and such via human input. It would seem that the correct time to record in that case is in fact the local time, not UTC. If that system is left alone for years, and does not receive any zone rule updates, it will likely begin storing the wrong UTC values. When the data is later transported out (upgrade, archive, whatever), it will be incorrect unless you use that particular snapshot of the zone rules. That situation might sound a bit contrived, but I think the real point is that even for some records of observed times, the local time is the authoritative one, not UTC. ...and for that scenario you have TIMESTAMP WITHOUT TIME ZONE --Magne ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Some questions about mammoth replication
> > btw, can you publicly discuss how CommandPrompt's WAL-based > replication works ? It's my company, of course I am ;)... but not in this thread. If you are interested feel free to email me directly or start a new thread. Good :) Here come my questions : From looking at http://www.commandprompt.com/images/MR_components.jpg it seems that you don't do replication just from WAL logs, but also collect some extra info inside postgreSQL server. Is this so ? If it is, then in what way does it differ from simple trigger-based change logging ? Do you make use of snapshot data, to make sure, what parts of WAL log are worth migrating to slaves , or do you just apply everything in WAL in separate transactions and abort if you find out that original transaction aborted ? Are your slaves a) standby b) read-only or c) read-write ? Do you extract / generate full sql DML queries from data in WAL logs, or do you apply the changes at some lower level ? For what use cases do you think your WAL-based approach is better than Slony/Skytools trigger-based one ? -- Hannu ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Timezone database changes
On 10/10/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Trevor Talbot" <[EMAIL PROTECTED]> writes:
> > Actually, what I meant at least (not sure if others meant it), is storing the value in the timezone it was entered, along with what zone that was. That makes the value stable with respect to the zone it belongs to, instead of being stable with respect to UTC. When DST rules change, the value is in effect "reinterpreted" as if it were input using the new rules.
>
> What happens if the rules change in a way that makes the value illegal or ambiguous (ie, it now falls into a DST gap)?

That's a good question. I have a vague memory of something that absolutely needed to accept such values (as this would have to) choosing a reasonable way to interpret them. In the case of jumps forward, e.g. 1:59 -> 3:00, a time of 2:15 is assumed to be on the previous scale, and thus interpreted as 3:15. For overlapping times, it picks one, but I don't recall which. Unfortunately I don't remember where I picked that up; it might have been a semi-standard, or it might have been someone's personal theory.

Your later example of midnight EDT + 3 months wanting to be midnight EST is a good one, so what I said earlier about internally converting to UTC is not something you want to do eagerly. I'd wondered why upthread Kevin mentioned using separate date and time types instead of just using timestamp; now I know. This point should go in any documentation enhancement too.

> But perhaps more to the point, please show use-cases demonstrating that this behavior is more useful than the pure-UTC behavior. For storage of actual time observations, I think pure-UTC is unquestionably the more useful. Peter's example of a future appointment time is a possible counterexample, but as observed upthread it's hardly clear which behavior is more desirable in such a case.

Actually, it usually is, because a human picked one ahead of time.
For example, if the appointment is set for 3pm in London, the London zone is the authoritative one, so that's what you store it in the DB as. If you're viewing it in NZ time, and the NZ DST rules change, so does what you see. If the London rules change, what you see in NZ still changes, but what you see in London does not.

Choosing UTC in that scenario only works if the London DST rules don't change. Choosing the referencing timezone (London) when you store the value works if either one changes. If an organization is regularly scheduling such things, they might just settle on UTC anyway to avoid confusion, in which case you store values in UTC and get the same behavior as you do currently.

I don't know what this person was doing, but I gather sticky timezones were preferable to them: http://archives.postgresql.org/pgsql-general/2007-08/msg00461.php

Thinking that it might have had out of date zone rules brings up an interesting scenario though. Consider a closed (no networking or global interest) filing system in a local organization's office, where it's used to record the minutes of meetings and such via human input. It would seem that the correct time to record in that case is in fact the local time, not UTC.

If that system is left alone for years, and does not receive any zone rule updates, it will likely begin storing the wrong UTC values. When the data is later transported out (upgrade, archive, whatever), it will be incorrect unless you use that particular snapshot of the zone rules.

That situation might sound a bit contrived, but I think the real point is that even for some records of observed times, the local time is the authoritative one, not UTC.
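The "sticky" storage idea described above can be sketched in a few lines of Python: keep the wall-clock time plus the authoritative zone name, and convert only at display time, so a rule change in either zone moves the rendered result but never corrupts the stored value. The record layout and function name here are illustrative, not from the thread.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# The appointment is stored as (naive wall-clock time, zone name):
# London is authoritative, so 3pm London is what goes in the record.
appointment = {"local_time": datetime(2023, 1, 10, 15, 0),
               "zone": "Europe/London"}

def show_in(appt, viewer_zone):
    # Interpret the stored wall clock in its own zone at display time,
    # then render for the viewer. If London's rules change, the UTC
    # instant (and hence the NZ rendering) changes with them; the
    # stored 15:00 does not.
    aware = appt["local_time"].replace(tzinfo=ZoneInfo(appt["zone"]))
    return aware.astimezone(ZoneInfo(viewer_zone))

# 3pm London in January (GMT, +00) is 4am the next day in Auckland (NZDT, +13).
nz = show_in(appointment, "Pacific/Auckland")
print(nz)
```

Storing a pre-converted UTC instant instead would bake the current London rules into the value, which is exactly the failure mode discussed above.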
Re: [HACKERS] Plan invalidation vs temp sequences
Gregory Stark wrote:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
> > There doesn't seem to be any very nice way to fix this. There is not any existing support mechanism (comparable to query_tree_walker) for scanning whole plan trees, which means that searching a cached plan for regclass Consts is going to involve a chunk of new code no matter how we approach it. We might want to do that someday --- in particular, if we ever try to extend the plan inval mechanism to react to redefinitions of non-table objects, we'd likely need some such thing anyway. I'm disinclined to try to do it for 8.3 though. The use-case for temp sequences seems a bit narrow and there are several workarounds (see followups to bug report), so I'm feeling this is a fix-some-other-day kind of issue.
>
> Given that sequences are in fact relations is there some way to work around the issue at least in this case by stuffing the sequence's relid someplace which the plan invalidation code can check for it?

Hm... couldn't this be worked around by doing

    create or replace function dynamic_oid(text) returns regclass
    as 'select $1::regclass' language sql stable;

and then writing nextval(dynamic_oid('mysequence'))? I didn't test this, but if it actually works, maybe we should just stick this into the docs somewhere. It's probably too late to add that function to the backend, though...

As long as mysequence is really a temporary sequence, this won't even have search_path issues I think, because temp objects are always on top of the search_path anyway, aren't they?

greetings, Florian Pflug
Re: [HACKERS] Including Snapshot Info with Indexes
Gokulakannan Somasundaram wrote:
> As explained, if we are going to include the snapshot with indexes, Vacuum will be done on the index independent of the table, so Vacuum will not depend on immutability. We need to go to the index from the table when we want to update the snapshot info. The problem on hand is that some of the user-defined functions are mutable, whereas the user might mark them immutable.
>
> So my idea is to have a mapping index, with tupleid as the first column and the function's values as subsequent columns. I have a somewhat detailed design in mind. So there will be an overhead of 3 extra I/Os for update/delete on indices based on user-defined functions. But this setup will speed up a lot of queries where the tables are partitioned and there will be more inserts and selects and dropping of partitions at periodic intervals. Updates become costly by 3 I/Os per index with snapshot. So if someone has more selects than updates+deletes then this index might come in handy (of course not with user-defined functional indices).

I think you need to explain why that is better than using the Dead Space Map. We're going to want the DSM anyway, to speed up VACUUMs; enabling index-only scans just came as an afterthought. While a DSM designed just for speeding up vacuums might look slightly different from one used for index-only scans, the infrastructure is roughly the same.

What you're proposing sounds a lot more complex, less space-efficient, and slower to update. It requires extra action from the DBA, and it covers exactly the same use case (more selects than updates+deletes, to use your words). It would require changes to all index access methods, while the DSM would automatically work with all of them. In particular, including visibility information in a bitmap index, should we have bitmap indexes in the future, is impossible, while the DSM approach would just work.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Plan invalidation vs temp sequences
"Tom Lane" <[EMAIL PROTECTED]> writes:
> There doesn't seem to be any very nice way to fix this. There is not any existing support mechanism (comparable to query_tree_walker) for scanning whole plan trees, which means that searching a cached plan for regclass Consts is going to involve a chunk of new code no matter how we approach it. We might want to do that someday --- in particular, if we ever try to extend the plan inval mechanism to react to redefinitions of non-table objects, we'd likely need some such thing anyway. I'm disinclined to try to do it for 8.3 though. The use-case for temp sequences seems a bit narrow and there are several workarounds (see followups to bug report), so I'm feeling this is a fix-some-other-day kind of issue.

Given that sequences are in fact relations, is there some way to work around the issue, at least in this case, by stuffing the sequence's relid someplace which the plan invalidation code can check for it?

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] quote_literal with NULL
On 10/11/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> Well, it's clearly useful in INSERT and UPDATE. For WHERE cases, you might or might not be able to use it, but I note that quote_nullable() would work much more like what happens if you use a parameter symbol and then bind NULL as the actual parameter value ...
>
> In hindsight we should probably have done quote_literal the way the OP suggests, but I concur that it's too late to change it. An additional function seems a reasonable compromise.

quote_nullable() works for me. I'll write up a patch.

Cheers, BJ
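The semantics being agreed on here can be sketched outside the backend: quote_literal is strict (NULL in, NULL out), which silently nulls out any string built around it, while the proposed quote_nullable maps SQL NULL to the literal keyword NULL so the result is always usable in constructed INSERT/UPDATE text. This Python model is only an illustration of that contract, not the actual patch.

```python
def quote_literal(val):
    # Models PostgreSQL's quote_literal: strict, so NULL input yields
    # NULL output (represented here as None), and embedded single
    # quotes are doubled.
    if val is None:
        return None
    return "'" + str(val).replace("'", "''") + "'"

def quote_nullable(val):
    # The proposed variant: never returns NULL; a missing value
    # becomes the unquoted keyword NULL, valid in INSERT/UPDATE.
    return "NULL" if val is None else quote_literal(val)

# Building an INSERT with quote_nullable stays well-formed either way:
for name in ("O'Brien", None):
    print("INSERT INTO t (name) VALUES (%s);" % quote_nullable(name))
```

As Tom notes above, this matches what a bound NULL parameter does in INSERT/UPDATE, but in WHERE clauses `col = NULL` is never true, so callers still need `IS NULL` handling there.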
Re: [HACKERS] Plan invalidation vs temp sequences
Tom Lane wrote:
> ... We might want to do that someday --- in particular, if we ever try to extend the plan inval mechanism to react to redefinitions of non-table objects, we'd likely need some such thing anyway. I'm disinclined to try to do it for 8.3 though. The use-case for temp sequences seems a bit narrow and there are several workarounds (see followups to bug report), so I'm feeling this is a fix-some-other-day kind of issue.

Agreed. I was a bit worried about this kind of usage:

    CREATE OR REPLACE FUNCTION testfunc(val int) RETURNS int AS $$
    DECLARE
    BEGIN
      CREATE TEMPORARY SEQUENCE tempseq;
      CREATE TEMPORARY TABLE inttable
        (key integer DEFAULT nextval('tempseq'), data text);
      INSERT INTO inttable (data) VALUES ('foo');
      DROP TABLE inttable;
      DROP SEQUENCE tempseq;
      RETURN 1;
    END;
    $$ LANGUAGE plpgsql;

but that seems to work, because creating/dropping the temp table triggers the plan invalidation.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Skytools committed without hackers discussion/review
On 10/10/07, Marko Kreen <[EMAIL PROTECTED]> wrote:
> On 10/10/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> > * Why is txid_current_snapshot() excluding subtransaction XIDs? That might be all right for the current uses in Slony/Skytools, but it seems darn close to a bug for any other use.
>
> In queue usage the transaction that stores snapshots does not do any other work on its own, so it does not matter; the main requirement is that txid_current()/txid_current_snapshot() be symmetric - whatever txid_current() outputs, txid_current_snapshot() measures.
>
> But I agree, supporting subtransactions makes the API more universal. And it wouldn't break Slony/PgQ current usage.

I thought about it with a clear head, and am now of the opinion that subtransactions should be left out of the current API. I really fail to imagine a scenario where they would be useful.

The module's main use comes from the scenario where txid_current(), txid_current_snapshot() and the reader of them are all different transactions. The main guarantee the API makes is that as soon as you can see an inserted snapshot in a table, you can also see all inserted events in a different table. There does not seem to be any use for them intra-transaction.

Adding them in now would just be premature bloat. We can always do it later, when someone presents a case for them. Then we can also decide whether it should go into the current API or become a parallel API besides it. It should not break Slony/Skytools either way.

-- marko
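The visibility guarantee discussed above comes down to checking an event's txid against a stored snapshot's (xmin, xmax, running-xid list). A rough Python sketch of that rule, with illustrative values (this models the general idea behind a txid-snapshot visibility test, not the exact backend code):

```python
def visible_in_snapshot(txid, snap):
    # snap is (xmin, xmax, xip): txids below xmin had completed before
    # the snapshot; txids at or above xmax had not yet been assigned;
    # txids in between are visible unless they were still in progress
    # (listed in xip) when the snapshot was taken.
    xmin, xmax, xip = snap
    if txid < xmin:
        return True
    if txid >= xmax:
        return False
    return txid not in xip

# A queue consumer, given a stored snapshot, keeps only events whose
# writing transaction is visible in it:
snap = (100, 110, {103, 107})          # illustrative snapshot
events = [(99, "a"), (103, "b"), (105, "c"), (115, "d")]
ready = [ev for ev in events if visible_in_snapshot(ev[0], snap)]
print(ready)
```

This is why the symmetry Marko describes matters: whatever txid_current() stamped on an event must be comparable against what txid_current_snapshot() recorded, so the reader (a third transaction) can split events into "safely readable now" and "retry next batch".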