Re: [HACKERS] New feature request: FlashBack Query

2007-03-01 Thread August Zajonc
Florian G. Pflug wrote:
 Alvaro Herrera wrote:
 Florian G. Pflug wrote:

 However, I just realized that doing this is much harder than I initially
 thought, because catalog access always happens with SnapshotNow, and
 e.g. drop table deletes datafiles at commit time, and not during
 vacuum.

 Not to mention the likenesses of CLUSTER and TRUNCATE, which would need
 to be taught how to keep the old datafiles for an additional week/hour.

 What I don't understand is why people aren't working on improving
 contrib/spi/timetravel.
 
 Because it serves a different use case, I think - flashback is an
 administrative tool, not something you design your application around.
 Flashback is more similar to PITR recovery than to contrib/spi/timetravel.

Drat. I remember when truncate wasn't even transaction safe, but I think
it has since been changed so that the non-rollbackable portion happens
after commit.

Ultimately, anything that changes data would basically need to be deferred
into the vacuum or some other cycle. In effect, super MVCC: a TRUNCATE would
apply the tuple-style versioning semantics to the underlying files. Catalog
operations, HOT - all would need those semantics too. Not doable.
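The "tuple-style versioning semantics on the underlying files" idea can be sketched in miniature. This is a hypothetical, illustrative Python model (none of these names exist in PostgreSQL): datafiles carry xmin/xmax-style markers, TRUNCATE marks the old file dead instead of unlinking it at commit, and a vacuum-style pass reclaims it only once no snapshot or flashback window can still see it.

```python
# Hypothetical sketch of "super MVCC" for datafiles: TRUNCATE defers the
# unlink into a vacuum-style cycle instead of deleting at commit time.

class DataFile:
    def __init__(self, path, created_xid):
        self.path = path
        self.created_xid = created_xid   # analogous to tuple xmin
        self.deleted_xid = None          # analogous to tuple xmax

def truncate(files, path, current_xid):
    """Mark the current file dead and start a fresh one; no unlink here."""
    for f in files:
        if f.path == path and f.deleted_xid is None:
            f.deleted_xid = current_xid
    files.append(DataFile(path + ".new", current_xid))

def vacuum(files, oldest_visible_xid):
    """Reclaim files that no snapshot (or flashback window) can still see."""
    return [f for f in files
            if f.deleted_xid is None or f.deleted_xid >= oldest_visible_xid]

files = [DataFile("16384", created_xid=100)]
truncate(files, "16384", current_xid=200)
files = vacuum(files, oldest_visible_xid=150)  # old file still visible: kept
assert len(files) == 2
files = vacuum(files, oldest_visible_xid=300)  # window has passed: reclaimed
assert len(files) == 1 and files[0].path == "16384.new"
```

The hard part, as the paragraph says, is pushing these semantics through the catalog, HOT, and every AccessExclusiveLock call site.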

A lot of places that grab an AccessExclusiveLock are probably subject to
this issue.

Unless there were a bog-standard way of doing this, and I don't see a
good option, it's a no go.

So it's fun to think about, and there are probably all sorts of neat things
you could do with super MVCC (TRUNCATE a table concurrently with open
transactions), but it's way too much work for the gain of this niche feature...

contrib/timetravel I think has some of these same issues (i.e., drop a
table - can you still time travel?) along with a fair bit of trigger-based
overhead...
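That trigger-based overhead can be sketched as follows - a toy Python model of a timetravel-style table (the valid_from/valid_to convention is illustrative, not contrib/spi/timetravel's exact schema): every UPDATE closes out the old row version and inserts a new one, which is the per-write cost being referred to.

```python
# Toy model of trigger-based time travel: rows carry validity intervals,
# updates never overwrite in place, and queries can ask "as of time t".

INFINITY = float("inf")

def timetravel_update(rows, key, new_values, now):
    """Close the current version of `key` and append the updated version."""
    for row in rows:
        if row["key"] == key and row["valid_to"] == INFINITY:
            row["valid_to"] = now                 # old version ends now
    rows.append({"key": key, "valid_from": now,
                 "valid_to": INFINITY, **new_values})

def as_of(rows, key, t):
    """Time travel: fetch the version of `key` that was visible at time t."""
    for row in rows:
        if row["key"] == key and row["valid_from"] <= t < row["valid_to"]:
            return row
    return None

rows = [{"key": 1, "valid_from": 0, "valid_to": INFINITY, "val": "a"}]
timetravel_update(rows, key=1, new_values={"val": "b"}, now=100)
assert as_of(rows, 1, 50)["val"] == "a"    # past version still queryable
assert as_of(rows, 1, 150)["val"] == "b"
```

Note that, just as the message observes, a DROP TABLE removes all of this history at once - the triggers can't protect against that.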

- August

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] New feature request: FlashBack Query

2007-02-21 Thread Csaba Nagy
 Do 97% of transactions commit because Oracle has slow rollbacks and  
 developers are working around that performance issue, or because they  
 really commit?
 
 I have watched several developers that would prefer to issue numerous  
 selects to verify things like foreign keys in the application in  
 order to avoid a rollback.

Most of the code we have cannot afford a rollback, because it can be
part of a much bigger transaction which would carry a much higher
performance penalty if retried than a simple rollback would. And you know
that in postgres you can't roll back just the last insert; you take the
whole transaction down with it... and it's simply a performance bottleneck
to retry in a high-contention scenario (which is usually the case for us).

So I would say we don't avoid rollbacks because of the cost of the
rollback, but because of the cost of the retry...

Cheers,
Csaba.





Re: [HACKERS] New feature request: FlashBack Query

2007-02-21 Thread Florian G. Pflug

Theo Schlossnagle wrote:


On Feb 20, 2007, at 1:40 PM, Tom Lane wrote:


RPK [EMAIL PROTECTED] writes:

I did not mean asking for undo from a life-time log. Since FlashBack
Technology is already there, I just mean that the world's most advanced
database (PostgreSQL, as they say) must have an optimized way of undoing at
least a week's changes.


You're living in a dream world.  Do you know any Oracle DBs who keep
enough rollback segments to go back a week?


Ours go for a good 6 hours sometimes :-D


Even if it's just one hour, it's certainly better than nothing.
I fully agree that it would not be acceptable to introduce performance
problems for _everyone_ by introducing flashback. But if you only
experience a drop in performance if you actually enable flashback
(by, let's say, setting vacuum_min_deadtime=1h), then I don't see
why anyone would object to having support for some kind of flashback.
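A sketch of how such a setting could gate vacuum. Note that vacuum_min_deadtime is Florian's hypothetical GUC, not a real PostgreSQL setting, and the data structures below are invented for illustration:

```python
# Hypothetical: vacuum reclaims only tuple versions that have been dead
# longer than the configured flashback window, so recently deleted
# versions stay around to be queried by a flashback transaction.

def reclaimable(dead_tuples, vacuum_min_deadtime_s, now):
    """Return the dead tuples whose age exceeds the flashback window."""
    return [t for t in dead_tuples
            if now - t["dead_since"] > vacuum_min_deadtime_s]

now = 10_000.0
dead = [{"ctid": (0, 1), "dead_since": 5_000.0},  # dead 5000 s: reclaimable
        {"ctid": (0, 2), "dead_since": 9_000.0}]  # dead 1000 s: kept for flashback
to_reclaim = reclaimable(dead, vacuum_min_deadtime_s=3600, now=now)
assert [t["ctid"] for t in to_reclaim] == [(0, 1)]
```

The cost model is then exactly the one discussed elsewhere in the thread: with the window set to zero nothing changes, and with it set to an hour, vacuum behaves as if an hour-old transaction were always open.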

However, I just realized that doing this is much harder than I initially
thought, because catalog access always happens with SnapshotNow, and
e.g. drop table deletes datafiles at commit time, and not during vacuum.

Supporting "begin; drop table mytable; commit; begin; set transaction
flashback 1 hour; select * from mytable; commit" would therefore be
really hard...

greetings, Florian Pflug



Re: [HACKERS] New feature request: FlashBack Query

2007-02-21 Thread Alvaro Herrera
Florian G. Pflug wrote:

 However, I just realized that doing this is much harder than I initially
 thought, because catalog access always happens with SnapshotNow, and
 e.g. drop table deletes datafiles at commit time, and not during vacuum.

Not to mention the likenesses of CLUSTER and TRUNCATE, which would need
to be taught how to keep the old datafiles for an additional week/hour.

What I don't understand is why people aren't working on improving
contrib/spi/timetravel.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] New feature request: FlashBack Query

2007-02-21 Thread Florian G. Pflug

Alvaro Herrera wrote:

Florian G. Pflug wrote:


However, I just realized that doing this is much harder than I initially
thought, because catalog access always happens with SnapshotNow, and
e.g. drop table deletes datafiles at commit time, and not during vacuum.


Not to mention the likenesses of CLUSTER and TRUNCATE, which would need
to be taught how to keep the old datafiles for an additional week/hour.

What I don't understand is why people aren't working on improving
contrib/spi/timetravel.


Because it serves different usecase I think - flashback is an 
administrative tool, not something you design your application around.

Flashback is more similar to PITR recovery than to contrib/spi/timetravel.

greetings, Florian Pflug



Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Gregory Stark
Jonah H. Harris [EMAIL PROTECTED] writes:

 On 2/17/07, Joshua D. Drake [EMAIL PROTECTED] wrote:
 My understanding is that the main difference is that rollbacks are
 inexpensive for us, but expensive for Oracle.

 Yes, Oracle is optimized for COMMIT, we're optimized for ROLLBACK :)

I used to say that too but I've since realized it's not really true. It's more
like Oracle is optimized for data that's committed long in the past and we're
optimized for data that's been recently updated. 

In Oracle the data that's been committed long in the past requires no
transactional overhead but the data that's been recently updated requires lots
of work to fetch the right version. 

In Postgres it's the other way around: data that was deleted long
ago requires extra work to clean up, but data that's been recently changed
requires little additional work to see the correct version.

In a sense then it's the opposite of what we usually say. Oracle is optimized
for mostly static data. Postgres is optimized for changing data.
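A minimal sketch of why recently changed data is cheap to read in Postgres. This is a deliberate simplification of the real visibility logic, with invented names: each tuple version carries its inserting and deleting transaction IDs, and a reader just compares them against its snapshot - no undo chain to walk.

```python
# Simplified MVCC visibility test: the snapshot treats every xid >=
# snap_xmax, plus any xid listed as in progress, as not yet committed.

def tuple_visible(xmin, xmax, snap_xmax, in_progress):
    inserter_visible = xmin < snap_xmax and xmin not in in_progress
    if not inserter_visible:
        return False            # inserting transaction not visible to us
    if xmax is None:
        return True             # never deleted
    deleter_visible = xmax < snap_xmax and xmax not in in_progress
    return not deleter_visible  # deleted, but not as far as our snapshot knows

# A row "in flux": old version deleted by xid 120, new version inserted by 120.
assert tuple_visible(90, 120, snap_xmax=115, in_progress={110}) is True
assert tuple_visible(120, None, snap_xmax=115, in_progress={110}) is False
```

The flip side, as the message says, is the dead versions left behind for vacuum, which is where the cleanup cost lands.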

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread RPK

I agree that TimeStamp creates an overhead, but I just want to know: if an
accidental update happened to a table and this incident is traced three
days later, what facility does PGSQL provide to bring the table back to its
original condition? You can't sit regretting why you did not run ROLLBACK
before COMMIT. (Correct me. I am only a user).

When talking about Oracle's technology and the overhead it creates, it is
true that Oracle's database is not for ordinary machines. You can't expect
performance from Oracle on a normal 256 MB machine. But still, the more
recovery options, the better for mission-critical environments.

The feature of enabling/disabling TimeStamp logging is acceptable. A user
must be able to decide whether a FlashBack-type option is needed or not. In
Oracle 10g we can switch off the FlashBack feature if we are low on resources.
If PGSQL is to be used in a mission-critical situation, then no company will
rely on low-end machines. For these types of situations the best environment
is chosen, and I think PGSQL must have this type of recovery option. The PGSQL
installer can ask the user during setup to enable/disable TimeStamp logging.

Restoring the database from a backup file that was created three days ago is
not feasible. The changes in other tables and the new things created since
would have to be redone, all at the price of just undoing the last update on
a particular table.


Warren Turkal-5 wrote:
 
 On Saturday 17 February 2007 07:49, RPK wrote:
 PostgreSQL, already a mature database, needs to have more options for
 recovery as compared to proprietary databases. I just worked with
 Oracle's
 FlashBack query feature in Oracle 9i and FlashBack Table feature in 10g.

 Future versions of PostgreSQL must have similar features which enable
 users
 to bring Table(s) and/or Database(s) to a desired Time Stamp.
 
 Check out my proposal[1] for Temporal extensions. Ultimately, creating
 valid 
 time and transaction time tables would be possible through my proposal.
 Please 
 check it out.
 
 [1]http://archives.postgresql.org/pgsql-hackers/2007-02/msg00540.php
 
 wt
 -- 
 Warren Turkal (w00t)
 
 
 





Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Andrew Dunstan
RPK wrote:

 I agree that TimeStamp creates an overhead, but I just want to know: if an
 accidental update happened to a table and this incident is traced three
 days later, what facility does PGSQL provide to bring the table back to its
 original condition? You can't sit regretting why you did not run ROLLBACK
 before COMMIT. (Correct me. I am only a user).


Why the heck can't you create a reversing transaction? That's what
ordinary mortals do. Demanding unlimited undo at some time that is
arbitrarily distant in the future strikes me as wholly unreasonable.

What do you mean by accidental update? What you really appear to mean is
that a program or a human operator has made an error, and incorrectly told
the database to commit a transaction. The answer surely is to correct the
behaviour of the program or human, rather than wanting the database to
provide an undo facility. Alternatively, this should be handled at the
application layer, using something like table_log.
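The "reversing transaction" can be sketched mechanically. Assuming, hypothetically, that a table_log-style trigger captured before-images of the rows the bad UPDATE touched, a small script can emit the SQL that restores them; the table and column names here are invented for illustration:

```python
# Sketch: generate reversing UPDATE statements from logged before-images,
# which is the "reversing transaction" ordinary mortals run.

def reversing_updates(table, key_col, before_images):
    """Yield UPDATE statements restoring each row's logged old values."""
    for row in before_images:
        sets = ", ".join(f"{col} = {val!r}"
                         for col, val in row.items() if col != key_col)
        yield f"UPDATE {table} SET {sets} WHERE {key_col} = {row[key_col]!r};"

log = [{"id": 7, "balance": 120.50}, {"id": 9, "balance": 33.00}]
stmts = list(reversing_updates("accounts", "id", log))
assert stmts[0] == "UPDATE accounts SET balance = 120.5 WHERE id = 7;"
```

Run the generated statements in one transaction and you have an explicit, auditable undo, without asking the database to retain arbitrary history.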

Some things just don't work well with this sort of facility. Just ask your
bookie if you can undo a bet that you accidentally placed with him and
which, three days later, you discover (after the race) was a mistake.


cheers

andrew






Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Jonah H. Harris

On 2/20/07, Gregory Stark [EMAIL PROTECTED] wrote:

I used to say that too but I've since realized it's not really true.


Heh, take a joke man... I was following up on Drake's email :)

But, since you want to discuss your view of the systems openly... I'll
gladly reply :)


It's more like Oracle is optimized for data that's committed
long in the past and we're optimized for data that's
been recently updated.


Wrong.  When Oracle says it's committed, it's committed.  No
difference between when, where, and how.  In Oracle, the committed
version is *always* the first presented to the user... it takes time
to go back and look at older versions; but why shouldn't that be a bit
slower, it isn't common practice anyway.  Same with rollbacks... why
should they optimize for them when 97% of transactions commit?


In Oracle the data that's been committed long in the past requires no
transactional overhead but the data that's been recently updated requires lots
of work to fetch the right version.


Wrong.  The same transactional overhead applies to *all* data in
Oracle, no matter when it was committed.  Similarly, the only
overhead required occurs when someone is querying in serializable
isolation or on read-committed data before or during a commit.  On
short OLTP-type transactions, Oracle has the most optimized solution.


In Postgres it's the other way around: data that was deleted long ago
requires extra work to clean up, but data that's been recently changed
requires little additional work to see the correct version.


PostgreSQL has little additional work?  Like, checking the validity of
every tuple?  Oracle checks visibility at the block level, so there's
*much* less overhead.  Take most of the benchmarks which can hold ~200
tuples per block.  Tables in those benchmarks are 100+ million rows.
On a sequential scan, Oracle would perform 500K checks, PostgreSQL
would perform *all* 100M checks (not counting dead versions due to row
updates and the like).  On an index scan, Oracle not only has a
smaller index and less to check, but also knows the tuple will be
committed and will, in most cases, not have to perform additional
physical I/O to find the latest version of a row.
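The arithmetic behind those numbers, for concreteness:

```python
# Back-of-the-envelope check of the figures above: ~200 tuples per block on
# a 100M-row table, one visibility check per block vs one per tuple.

rows = 100_000_000
tuples_per_block = 200
blocks = rows // tuples_per_block

block_level_checks = blocks   # one check per block, per the argument above
tuple_level_checks = rows     # one check per tuple

assert blocks == 500_000
assert tuple_level_checks // block_level_checks == tuples_per_block
```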

Of course, Oracle's design is much more complicated in its ability to
build read-committed versions of the blocks at runtime; something the
simplicity of PostgreSQL's MVCC design eliminates.


In a sense then it's the opposite of what we usually say. Oracle is optimized
for mostly static data. Postgres is optimized for changing data.


Care to share an example to prove it?

Like always, there are pros and cons with both designs, but denying
facts gets us nowhere.  We're off-topic now... so we should either
move this off line or to another thread.  I personally don't see much
of a reason to continue discussing MVCC designs anymore as Oracle's is
patented and PostgreSQL's is highly unlikely to change drastically.

As always, I'd suggest discussing improvements, not the status quo.
Likewise, discussing Oracle's design, drawbacks, and limitations
without having used it extensively is quite obvious to anyone familiar
with Oracle.  Don't get me wrong, it's fine to prefer one design to
another, but pushing discussion items comparing Oracle to PostgreSQL
because of things you've heard or read somewhere isn't the same as
understanding them because you've used them.

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 3rd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/



Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Rod Taylor


Wrong.  When Oracle says it's committed, it's committed.  No
difference between when, where, and how.  In Oracle, the committed
version is *always* the first presented to the user... it takes time
to go back and look at older versions; but why shouldn't that be a bit
slower, it isn't common practice anyway.  Same with rollbacks... why
should they optimize for them when 97% of transactions commit?


Do 97% of transactions commit because Oracle has slow rollbacks and  
developers are working around that performance issue, or because they  
really commit?


I have watched several developers that would prefer to issue numerous  
selects to verify things like foreign keys in the application in  
order to avoid a rollback.


Anyway, I don't have experience with big Oracle applications but I'm  
not so sure that 97% of transactions would commit if rollbacks were  
cheaper.






Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Hannu Krosing
On Tue, 2007-02-20 at 10:20, Jonah H. Harris wrote:
 On 2/20/07, Gregory Stark [EMAIL PROTECTED] wrote:
  I used to say that too but I've since realized it's not really true.
 
 Heh, take a joke man... I was following up on Drake's email :)
 
 But, since you want to discuss your view of the systems openly... I'll
 gladly reply :)
 
  It's more like Oracle is optimized for data that's committed
  long in the past and we're optimized for data that's
  been recently updated.
 
 Wrong.  When Oracle says it's committed, it's committed.  No
 difference between when, where, and how.  In Oracle, the committed
 version is *always* the first presented to the user... it takes time
 to go back and look at older versions; 

Older versions are also committed :)

He probably meant longer transactions and several versions visible to
different backends.

 but why shouldn't that be a bit slower, it isn't common practice anyway.

Not for pure OLAP, at least when you have fairly fast transactions. But
it can slow things down when you have some hotspot tables.

 Same with rollbacks... why
 should they optimize for them when 97% of transactions commit?

Or the other way around - you should write code where most transactions
commit ;)

  In Oracle the data that's been committed long in the past requires no
  transactional overhead but the data that's been recently updated requires 
  lots
  of work to fetch the right version.
 
 Wrong.  The same transactional overhead applies to *all* data in
 Oracle no matter of when it was committed.  Similarly, the only
 overhead required occurs when someone is querying in serializable
 isolation or on read-committed data before or during a commit.  On
 short OLTP-type transactions, Oracle has the most optimized solution.
 
  In Postgres it's the other way around: data that was deleted long ago
  requires extra work to clean up, but data that's been recently changed
  requires little additional work to see the correct version.
 
 PostgreSQL has little additional work?  Like, checking the validity of
 every tuple?  Oracle checks visibility at the block level, so there's
 *much* less overhead. 

Hmm. How can it check visibility at the block level and at the same time do
in-place updates on single tuples?

  Take most of the benchmarks which can hold ~200
 tuples per block.  Tables in those benchmarks are 100+ million rows.
 On a sequential scan, Oracle would perform 500K checks, PostgreSQL
 would perform *all* 100M checks (not counting dead versions due to row
 updates and the like). 

My proposal of keeping visibility info in a separate heap would help to
get similar results, that is mostly 1 check per page. That would also
cover much of the index lookup cases below.
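A sketch of the saving from that proposal (the structure is invented for illustration): when a page-level summary says every tuple on the page is visible, a scan does one check per page instead of one per tuple.

```python
# Toy cost model of page-level visibility summaries: fully visible pages
# cost one check; pages with in-flux rows fall back to per-tuple checks.

def checks_needed(pages):
    """pages: list of (tuple_count, all_visible) summaries."""
    total = 0
    for tuple_count, all_visible in pages:
        total += 1 if all_visible else tuple_count
    return total

# 1000 pages of 200 tuples each; 990 fully visible, 10 with in-flux rows.
pages = [(200, True)] * 990 + [(200, False)] * 10
assert checks_needed(pages) == 990 + 10 * 200   # 2990 checks vs 200000 per-tuple
```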

 On an index scan, Oracle not only has a
 smaller index and less to check, but also knows the tuple will be
 committed and will, in most cases, not have to perform additional
 physical I/O to find the latest version of a row.

It is also the reason why you can forget about doing simultaneous data
loading and queries on the same table. If you know to avoid doing that,
then it will, in most cases, not have to perform additional physical
I/O to find the latest version of a row ;)

 Of course, Oracle's design is much more complicated in its ability to
 build read-committed versions of the blocks at runtime; something the
 simplicity of PostgreSQL's MVCC design eliminates.
 
  In a sense then it's the opposite of what we usually say. Oracle is 
  optimized
  for mostly static data. Postgres is optimized for changing data.
 
 Care to share an example to prove it?
 
 Like always, there are pros and cons with both designs, but denying
 facts gets us nowhere.  We're off-topic now... so we should either
 move this off line or to another thread.  I personally don't see much
 of a reason to continue discussing MVCC designs anymore as Oracle's is
 patented and PostgreSQL's is highly unlikely to change drastically.

I don't think we will ever move to rollback segments, but for some
use-cases moving visibility to a separate heap could make sense.

And if we want to bring back time travel (see another thread about
Flashback Queries), then we may end up implementing the original
PostgreSQL design spec and make VACUUM siphon dead tuples over to
archive relations, which already starts looking a little like rollback
segments, only for other purposes :)

 As always, I'd suggest discussing improvements, not the status quo.
 Likewise, discussing Oracle's design, drawbacks, and limitations
 without having used it extensively is quite obvious to anyone familiar
 with Oracle.

Using a system extensively can also create blind spots about some of the
system's (mis)features. One learns to avoid doing some things without
consciously knowing about it.

 Don't get me wrong, it's fine to prefer one design to
 another, but pushing discussion items comparing Oracle to PostgreSQL
 because of things you've heard or read somewhere isn't the same as
 understanding them because you've used them.

Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Gregory Stark
Jonah H. Harris [EMAIL PROTECTED] writes:

 On 2/20/07, Gregory Stark [EMAIL PROTECTED] wrote:

 It's more like Oracle is optimized for data that's committed
 long in the past and we're optimized for data that's
 been recently updated.

 Wrong.  When Oracle says it's committed, it's committed.  No
 difference between when, where, and how.  In Oracle, the committed
 version is *always* the first presented to the user... 

Sure, and if it was committed long in the past then you can use it. If it's
committed recently then you'll have to start looking up rollback data instead.

The rest of your post seems to all be predicated on the idea that if data is
committed then that's all you'll need to look at. But that's missing precisely
the point of what I was saying:

 In a sense then it's the opposite of what we usually say. Oracle is optimized
 for mostly static data. Postgres is optimized for changing data.

By changing data I meant data in flux, not the action of making changes to
the data. 

Looking at data in flux in Oracle -- even other data that's unchanged but
lives on the same page as some record that's in flux -- will require you to
look up rollback data and possibly even have to follow many pages of chained
rollback data. Looking at data in Postgres has no additional overhead when
it's data in flux versus old static data.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread August Zajonc
Tom Lane wrote:
 August Zajonc [EMAIL PROTECTED] writes:
 The key is how lightweight the setup could be, which matters because
 clients are not always willing to pay for a PITR setup. The low overhead
 would mean you'd feel fine about setting guc to 1hr or so.
 
 This would have exactly the same performance consequences as always
 having an hour-old open transaction.  I'm afraid that describing it
 as low overhead is mere wishful thinking: it would cripple vacuuming
 of high-update tables and greatly increase the typical load on pg_clog
 and pg_subtrans.  We already know that pg_subtrans contention can be a
 source of context-swap storms, with the size of the window back to
 GlobalXmin being the controlling factor for how bad it gets.
 
 It's possible that this last could be addressed by separating the
 concept of old enough to be vacuumed from GlobalXmin, but it's
 certainly not a trivial thing.

Isn't GlobalXmin for open transactions? I thought the idea was that
everything goes as usual, but you can flip a knob and say that vacuum
doesn't vacuum anything more recent than GlobalXmin less x transactions.

Then you can look at your transactions per second and get a rough window
to work within. Or, if there are timestamps on commits, that could be
switched to a more user-friendly time interval.
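That conversion is simple arithmetic - a sketch, with invented names, of turning a desired flashback window into a vacuum horizon in transaction IDs, given an observed transaction rate:

```python
# Rough conversion from a time-based flashback window to a vacuum horizon
# expressed in transaction IDs (purely illustrative).

def vacuum_horizon(global_xmin, tps, window_seconds):
    """XID below which vacuum may reclaim, holding back window_seconds of history."""
    holdback = int(tps * window_seconds)
    return global_xmin - holdback

# e.g. 500 tps and a 1-hour window hold back 1.8M transactions
assert vacuum_horizon(global_xmin=10_000_000, tps=500, window_seconds=3600) \
       == 8_200_000
```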

You end up simply delaying when an hour's worth of transactions gets
vacuumed. For folks doing nightly cron-job vacuums, not too bad.
Autovacuum isn't on by default :)

Of course, this will be clumsy if not per database.

But the thought might be to take advantage of the flashback data already
present under the MVCC model, as long as vacuum hasn't hit things (and
being willing to stop activity on a database, etc.). Given that you are
delaying a vacuum rather than being more aggressive, and know you can
already vacuum up to a more recent transaction xmin, I dunno... Does
anything depend (other than performance) on vacuum actually vacuuming as
far as it can?

- August



Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread RPK

Andrew,

 Demanding unlimited undo at some time that is arbitrarily distant in the
 future strikes me as wholly unreasonable.

I did not mean asking for undo from a life-time log. Since FlashBack
Technology is already there, I just mean that the world's most advanced
database (PostgreSQL, as they say) must have an optimized way of undoing at
least a week's changes. A week's log is enough, and PostgreSQL can keep
removing old logs automatically.

Secondly, it must be left to the user to decide the number of days of
archive he wants to store. Again, up to a week max.


RPK wrote:

 I agree that TimeStamp creates an overhead, but I just want to know: if an
 accidental update happened to a table and this incident is traced three
 days later, what facility does PGSQL provide to bring the table back to its
 original condition? You can't sit regretting why you did not run ROLLBACK
 before COMMIT. (Correct me. I am only a user).


Why the heck can't you create a reversing transaction? That's what
ordinary mortals do. Demanding unlimited undo at some time that is
arbitrarily distant in the future strikes me as wholly unreasonable.

What do you mean by accidental update? What you really appear to mean is
that a program or a human operator has made an error, and incorrectly told
the database to commit a transaction. The answer surely is to correct the
behaviour of the program or human, rather than wanting the database to
provide an undo facility. Alternatively, this should be handled at the
application layer, using something like table_log.

Some things just don't work well with this sort of facility. Just ask your
bookie if you can undo a bet that you accidentally placed with him and
which, three days later, you discover (after the race) was a mistake.


cheers

andrew











Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Tom Lane
RPK [EMAIL PROTECTED] writes:
 I did not mean asking for undo from a life-time log. Since FlashBack
 Technology is already there, I just mean that the world's most advanced
 database (PostgreSQL, as they say) must have an optimized way of undoing at
 least a week's changes.

You're living in a dream world.  Do you know any Oracle DBs who keep
enough rollback segments to go back a week?

regards, tom lane



Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Theo Schlossnagle


On Feb 20, 2007, at 1:40 PM, Tom Lane wrote:


RPK [EMAIL PROTECTED] writes:

I did not mean asking for undo from a life-time log. Since FlashBack
Technology is already there, I just mean that the world's most advanced
database (PostgreSQL, as they say) must have an optimized way of undoing at
least a week's changes.


You're living in a dream world.  Do you know any Oracle DBs who keep
enough rollback segments to go back a week?


Ours go for a good 6 hours sometimes :-D

// Theo Schlossnagle
// Esoteric Curio: http://www.lethargy.org/~jesus/




Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Jonah H. Harris

On 2/20/07, Rod Taylor [EMAIL PROTECTED] wrote:

Do 97% of transactions commit because Oracle has slow rollbacks and
developers are working around that performance issue, or because they
really commit?


Again, off-topic, but 97% of all transactions commit according to Jim
Gray and his research... not anything related to Oracle.


I have watched several developers that would prefer to issue numerous
selects to verify things like foreign keys in the application in
order to avoid a rollback.


That's just bad development.


Anyway, I don't have experience with big Oracle applications but I'm
not so sure that 97% of transactions would commit if rollbacks were
cheaper.


Again, stats not related to Oracle, but databases in general.


--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 3rd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/



Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Jonah H. Harris

On 2/20/07, Hannu Krosing [EMAIL PROTECTED] wrote:

He probably meant longer transactions and several versions visible to
different backends.


Yes, he may have... but I was responding to the statements he made.


 but why shouldn't that be a bit slower, it isn't common practice anyway.

Not for pure OLAP, at least when you have fairly fast transactions. But
it can slow things down when you have some hotspot tables.


True, but hotspots are hotspots and no matter what caused them or
where they are, they slow down performance in one area or another.
Limiting hotspots is generally an application-level design decision
anyway.


 Same with rollbacks... why
 should they optimize for them when 97% of transactions commit?

Or other way around, - you should write code, where most transactions
commit ;)


That's what I said, Oracle shouldn't optimize for rollbacks when most
transactions commit.


Hmm. How can it check visibility at block level and at the same time do
in-place updates on single tuples ?


In most cases, the block-level SCN determines transaction-level
visibility.  Now, row locks can exist within that page, but they don't
determine visibility... they determine the UNDO location which
contains the data required to rebuild a read-consistent version of the
block.
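A toy model of that rebuild (illustrative only, not Oracle internals): starting from the current block, while its SCN is newer than the query's snapshot SCN, follow the undo pointer to the prior block image.

```python
# Sketch: rebuilding a read-consistent block version by walking undo records
# until the block image predates the snapshot SCN.

def read_consistent(block, snapshot_scn, undo):
    """Walk the undo chain until the block version predates snapshot_scn."""
    while block["scn"] > snapshot_scn:
        block = undo[block["undo_ptr"]]   # prior image of this block
    return block

undo = {
    "u1": {"scn": 50, "rows": {"k": "old"}, "undo_ptr": None},
}
current = {"scn": 90, "rows": {"k": "new"}, "undo_ptr": "u1"}
assert read_consistent(current, snapshot_scn=60, undo=undo)["rows"]["k"] == "old"
assert read_consistent(current, snapshot_scn=95, undo=undo)["rows"]["k"] == "new"
```

This is also why reading data "in flux" costs more in this design: each hot block may require following several undo records, while a long-quiescent block is read as-is.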


My proposal of keeping visibility info in a separate heap would help to
get similar results, that is mostly 1 check per page. That would also
cover much of the index lookup cases below.


Most definitely.


I don't think we will ever move to rollback segments, but for some
use-cases moving visibility to a separate heap could make sense.


Yes.


And if we want to bring back time travel (see another thread about
Flashback Queries), then we may end up implementing the original
PostgreSQL design spec and making VACUUM siphon dead tuples over to
archive relations, which already starts looking a little like rollback
segments, only for other purposes :)


Yes.


Using a system extensively can also create blind spots about some of the
system's (mis)features. One learns to avoid doing some things without
consciously knowing about it.


I've used 'em all and can certainly name issues with Oracle.  However,
we're discussing improving PostgreSQL, I was responding to Greg's
statements, and I don't see the need to bring up unrelated Oracle
implementation details which will just lead to a general anti-Oracle
discussion.


To get a really meaningful discussion we should involve someone who has
*designed* them, not merely used them.


True.  My comment is more along the lines of uninformed discussion
which leads to Oracle-bashing.  Those who have at least used and
administered Oracle in production tend to understand Oracle's design
decisions and related issues better than those who have just heard of
Oracle's issues.

I live in the real world and can admit certain failures of any
database system regardless of which I prefer.  No single database is
best for every task.  I just didn't want the discussion going where it
normally goes, to being one of, we're right and they're wrong.

Can we move offline or to another thread if we want to continue
discussing Oracle-specifics; otherwise... let's focus on
flashback-like functionality in this thread.

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 3rd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/



Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread August Zajonc
RPK wrote:
 Andrew,
 
 Demanding unlimited undo at some time that is arbitrarily distant in the
 future strikes me as wholly unreasonable. 
 
 I did not mean asking for undo from a life-time log. Since FlashBack
 Technology is already there, I just mean that the world's most advanced
 database (PostgreSQL, as they say) must have an optimized way of undoing at
 least a week's changes. A week's log is enough, and PostgreSQL can keep
 removing old logs automatically.
 
 Secondly, it must be left to the user to decide the number of days of
 archive he wants to store. Again, up to a week max.

You might look at storing deltas or similar (perhaps with a check table)
if you need to change data from a week back. Then you can just find the
row representing the problematic change and delete it. If you really want
to track what happened, use deltas and then, instead of deleting them, put
a reversing delta in, keeping your entire audit trail.

You can put materialized views on top of this if you need performance.
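
As a rough illustration of the reversing-delta idea (the structure and
names here are hypothetical, not from any actual schema), the trail keeps
every change, and an undo is itself just another recorded change:

```python
# Toy model of a reversing-delta audit trail (hypothetical structure).
# Every change is stored as a signed delta; undoing a bad change means
# appending the opposite delta, so the full history is preserved.

def apply_deltas(deltas):
    """Fold (row_id, amount) deltas into the current row values."""
    state = {}
    for row_id, amount in deltas:
        state[row_id] = state.get(row_id, 0) + amount
        if state[row_id] == 0:
            del state[row_id]  # a fully reversed row disappears from view
    return state

def reversing(delta):
    """Build the delta that undoes a problematic change."""
    row_id, amount = delta
    return (row_id, -amount)

audit = [(1, 100), (2, 50)]    # normal activity
oops = (2, -50)                # accidental delete of row 2's value
audit.append(oops)
audit.append(reversing(oops))  # undo by reversing, not by deleting
```

The materialized view mentioned above would simply be the cached result of
folding the trail with something like `apply_deltas`.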

- August



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Andreas 'ads' Scherbaum

Hello,

On Sat, 17 Feb 2007 06:49:42 -0800 (PST)
RPK [EMAIL PROTECTED] wrote:

 PostgreSQL, already a mature database, needs to have more options for
 recovery as compared to proprietary databases. I just worked with Oracle's
 FlashBack query feature in Oracle 9i and FlashBack Table feature in 10g.
 
 Future versions of PostgreSQL must have similar features which enable users
 to bring Table(s) and/or Database(s) to a desired Time Stamp.

There is a pgfoundry project which tries to achieve this:

http://pgfoundry.org/projects/tablelog/


Kind regards

-- 
Andreas 'ads' Scherbaum
German PostgreSQL Usergroup: http://www.pgug.de



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Hannu Krosing
Ühel kenal päeval, P, 2007-02-18 kell 14:27, kirjutas Joshua D. Drake:
 Hannu Krosing wrote:
  Ühel kenal päeval, L, 2007-02-17 kell 22:49, kirjutas Chad Wagner:
  
 
  However, they don't have vacuum, we do.
 
  Right, and I think that is more or less because Oracle doesn't need
  it.  Vacuum's main purpose (correct me if I am wrong) is to
  recover/mark rows that are no longer used, and Oracle essentially
  reuses the space immediately. 
 
  Obviously with Oracle if you bloat out a table and delete a ton of
  rows then you have to rebuild the table, but that is more or less the
  same problem that PostgreSQL has and where vacuum full comes into
  play.
 
  The only benefit with the Oracle model is that you can achieve
  flashback, which is a very rarely used feature in my book.
  
  We can have flashbacks up to the last vacuum. It is just not exposed.
  Don't vacuum, and you have the whole history. (Actually you can't go for
  more than 2G transactions, or you get trx id rollover).
  
  To get a flashback query, you just have to construct a snapshot from
  that time and you are done. We don't store transaction times anywhere, so
  the flashback has to be by transaction id, but there is very little
  extra work involved. We just don't have syntax for saying SELECT ... AS
  SEEN BY TRANSACTION XXX
 
 Well this is certainly interesting. What do we think it would take to
 enable the functionality?

First we must run the query in serializable mode and replace the
snapshot with a synthetic one, which defines visibility at the start of
the desired transaction.

Probably it is a good idea to take a lock on all tables involved, to
avoid a vacuum being started on them while the query is running.

Also, we can't trust the DELETED flags in index pages, so we should
forbid index scans, or else always re-check visibility in the heap.

Otherwise it would probably be enough to just scan tuples as usual and
check whether they were visible to the desired transaction, that is,
they were inserted before that transaction and not deleted before it.
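
That check can be sketched as follows; this is only an illustrative
model, with a `committed` set standing in for the clog lookup (and hint
bits, subtransactions, etc.) that real code would have to handle:

```python
# Illustrative model of the flashback visibility check described above:
# a tuple is visible to the historic transaction `target_xid` if its
# inserting transaction (xmin) committed before target_xid, and it was
# not deleted by a transaction (xmax) that also committed before it.
# The `committed` set stands in for a pg_clog status lookup.

def visible_to(xmin, xmax, target_xid, committed):
    inserted_before = xmin in committed and xmin < target_xid
    deleted_before = (xmax is not None
                      and xmax in committed
                      and xmax < target_xid)
    return inserted_before and not deleted_before
```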

Of course this will not be true, once we have HOT/WIP with in-page
vacuuming.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Florian G. Pflug

Hannu Krosing wrote:

Ühel kenal päeval, P, 2007-02-18 kell 14:27, kirjutas Joshua D. Drake:

Hannu Krosing wrote:

Ühel kenal päeval, L, 2007-02-17 kell 22:49, kirjutas Chad Wagner:
To get a flashback query, you just have to construct a snapshot from
that time and you are done. We don't store transaction times anywhere, so
the flashback has to be by transaction id, but there is very little
extra work involved. We just don't have syntax for saying SELECT ... AS
SEEN BY TRANSACTION XXX

Well this is certainly interesting. What do we think it would take to
enable the functionality?


First we must run the query in serializable mode and replace the
snapshot with a synthetic one, which defines visibility at the start of
the desired transaction

probably it is a good idea to take a lock on all tables involved to
avoid a vacuum to be started on them when the query is running.

Would the xmin exported by that transaction prevent vacuum from removing
any tuples still needed for the flashback snapshot?

greetings, Florian Pflug



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Alvaro Herrera
Florian G. Pflug escribió:
 Hannu Krosing wrote:
 Ühel kenal päeval, P, 2007-02-18 kell 14:27, kirjutas Joshua D. Drake:
 Hannu Krosing wrote:
 Ühel kenal päeval, L, 2007-02-17 kell 22:49, kirjutas Chad Wagner:
 To get a flashback query, you just have to construct a snapshot from
 that time and you are done. We don't store transaction times anywhere, so
 the flashback has to be by transaction id, but there is very little
 extra work involved. We just don't have syntax for saying SELECT ... AS
 SEEN BY TRANSACTION XXX
 Well this is certainly interesting. What do we think it would take to
 enable the functionality?
 
 First we must run the query in serializable mode and replace the
 snapshot with a synthetic one, which defines visibility at the start of
 the desired transaction
 
 probably it is a good idea to take a lock on all tables involved to
 avoid a vacuum to be started on them when the query is running.
 Would the xmin exported by that transaction prevent vacuum from removing
 any tuples still needed for the flashback snapshot?

Sure, and that makes the mentioned lock unnecessary.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Zeugswetter Andreas ADI SD

  Well this is certainly interesting. What do we think it would take to
  enable the functionality?
 
 First we must run the query in serializable mode and replace
 the snapshot with a synthetic one, which defines visibility
 at the start of the desired transaction

We could use something that controls global xmin. It would ensure that
global xmin does not advance beyond what still needs to be visible.
This would probably be a sliding time window, or a fixed point in time
that is released by the DBA/user.

Then everything below is not really different from a situation where
you had a long-running transaction.

 probably it is a good idea to take a lock on all tables 
 involved to avoid a vacuum to be started on them when the 
 query is running.
 
 also, we can't trust the DELETED flags in index pages, so we 
 should forbid index scans, or just always re-check the 
 visibility in heap.
 
 Otherwise it would probably be enough to just scan tuples as 
 usual, and check if they were visible to desired transaction, 
 that is they were inserted before that transaction and they 
 are not deleted before that trx.
 
 Of course this will not be true, once we have HOT/WIP with 
 in-page vacuuming.

Currently I think HOT does honor global xmin. There is no 
lookup for relevant xids, so parts of an update chain where
only a previous tuple or a later tuple can be visible are reused.
Otherwise HOT would need to be told not to, in a scenario where
a backend can choose a snapshot at will.

Andreas



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Zeugswetter Andreas ADI SD

  First we must run the query in serializable mode and replace the
  snapshot with a synthetic one, which defines visibility at the start
  of the desired transaction
  
  probably it is a good idea to take a lock on all tables involved to
  avoid a vacuum to be started on them when the query is running.
  Would the xmin exported by that transaction prevent vacuum from
  removing any tuples still needed for the flashback snapshot?
 
 Sure, and that makes the mentioned lock unnecessary.

The problem is that that transaction sets its historic snapshot at a
later time, so it is not yet running when vacuum looks at global xmin.
So something else needs to hold up global xmin (see previous post).

Andreas



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Florian G. Pflug

Zeugswetter Andreas ADI SD wrote:
First we must run the query in serializable mode and replace the
snapshot with a synthetic one, which defines visibility at the start
of the desired transaction

probably it is a good idea to take a lock on all tables involved to
avoid a vacuum to be started on them when the query is running.

Would the xmin exported by that transaction prevent vacuum from
removing any tuples still needed for the flashback snapshot?

Sure, and that makes the mentioned lock unnecessary.


The problem is that that transaction sets its historic snapshot at a
later time, so it is not yet running when vacuum looks at global xmin.
So something else needs to hold up global xmin (see prev post).


I think to make this flashback stuff fly, you'd need to know the
earliest xmin that you can still flash back to. Vacuum would advance
that xmin as soon as it starts working. So the case you'd need to
protect against would be a race condition when you start a vacuum
and a flashback transaction at the same time. But for that, some simple
semaphore should suffice, plus a well-thought-out ordering of the
actions taken.

In the long run, you'd probably want to store the commit times of
transactions somewhere, and add some guc that makes a vacuum assume
that recently committed transactions (say, in the last hour) are still
considered active. That would allow the DBA to guarantee that he can
always flash back at least an hour.
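
The retention rule being proposed could look roughly like this (a
simplified model only: xids are ordered integers, `commit_times` is a
hypothetical commit-time record, and nothing here resembles the actual
vacuum code):

```python
# Model of the proposed guc: vacuum treats any transaction that committed
# within the last `retention` seconds as if it were still active, so its
# dead tuples survive long enough to flash back to.

def vacuum_horizon(global_xmin, commit_times, now, retention):
    """Return the oldest xid whose dead tuples vacuum must still keep."""
    recent = [xid for xid, t in commit_times.items()
              if now - t < retention]
    return min([global_xmin] + recent)
```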

greetings, Florian Pflug



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Gregory Stark

Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:

 First we must run the query in serializable mode and replace 
 the snapshot with a synthetic one, which defines visibility 
 at the start of the desired transaction

 We could use something that controls global xmin.
 It would ensure that global xmin does not advance beyond
 what still needs to be visible. This would probably be a 
 sliding time window, or a fixed point in time that is
 released by the dba/user.

Well, there's another detail you have to cover aside from rolling back
your xmin: you have to find the rest of the snapshot, including knowing
what other transactions were in progress at the time you want to flash
back to.

If you just roll back xmin and set xmax to the same value you'll get a
consistent view of the database but it may not match a view that was ever
current. That is, some of the transactions after the target xmin may have
committed before that xmin. So there was never a time in the database when
they were invisible but your new xmin was visible.
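
A small sketch of the point (illustrative only, modeling a snapshot as
an `(xmin, xmax, running)` triple rather than the real structure):

```python
# Why rolling back xmin alone is not enough: a snapshot also carries the
# set of transactions that were in progress when it was taken. An xid
# below xmax can still be invisible if it was running at snapshot time.

def xid_visible(xid, snapshot):
    xmin, xmax, running = snapshot
    if xid >= xmax:
        return False           # started after the snapshot was taken
    return xid not in running  # in-progress xids are invisible

# Suppose xid 7 committed while xids 5 and 8 were still running. A
# snapshot taken at that moment saw 7 but not 5 or 8; reconstructing it
# later needs the running set, not just a single xmin cutoff.
historic = (5, 9, {5, 8})
```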

I think to do this you'll need to periodically record a snapshot and then
later restore one of those saved snapshots. Not sure where would be a good
place to record them. The WAL seems like a handy place but digging through the
WAL would be annoying.

Incidentally this is one of the things that would be useful for read-only
access to PITR warm standby machines.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread tomas
On Mon, Feb 19, 2007 at 04:00:09PM +0100, Florian G. Pflug wrote:
[...]
 In the long run, you'd probably want to store the commit-times of 
 transactions somewhere, and add some guc that makes a vacuum assume
 that recently committed transactions (say, in the last hour) are still
 considered active [...]

Funny how some things recur:

  http://archives.postgresql.org/pgsql-hackers/2007-01/msg01301.php

(says I and seeks shelter beneath a big rock ;-)

Regards
- -- tomás




Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread August Zajonc
Gregory Stark wrote:
 Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
 
 First we must run the query in serializable mode and replace 
 the snapshot with a synthetic one, which defines visibility 
 at the start of the desired transaction
 We could use something that controls global xmin.
 It would ensure that global xmin does not advance beyond
 what still needs to be visible. This would probably be a 
 sliding time window, or a fixed point in time that is
 released by the dba/user.
 
 Well there's another detail you have to cover aside from rolling back your
 xmin. You have to find the rest of the snapshot including knowing what other
 transactions were in-progress at the time you want to flash back to.
 
 If you just roll back xmin and set xmax to the same value you'll get a
 consistent view of the database but it may not match a view that was ever
 current. That is, some of the transactions after the target xmin may have
 committed before that xmin. So there was never a time in the database when
 they were invisible but your new xmin was visible.
 
[...]
 Incidentally this is one of the things that would be useful for read-only
 access to PITR warm standby machines.
 

Couldn't you define things simply so that you get a consistent view
including all transactions started before transaction x? This is time
travel lite, but low overhead, which I think is a key benefit of this
approach.

A huge value for this would be in the oops, I deleted my data
category. PostgreSQL rarely loses data, but clients seem to have a habit
of doing so, and then going oops. This seems to happen most often when
facing something like a reporting deadline, where they are moving lots
of stuff around and making copies, and sometimes they delete the wrong
company recordset or equivalent, even with confirmation dialogs at the
app level.

This would give the client a quick and easy oops procedure. The DBA
sets the guc to 1hr and tells the client: if you make a big mistake,
stop the database server as follows and call. Frankly, it would bail a
few DBAs out as well.

The key is how lightweight the setup could be, which matters because
clients are not always willing to pay for a PITR setup. The low overhead
would mean you'd feel fine about setting guc to 1hr or so.

As a % of total installed instances, I suspect the % with PITR is
small. I've got stuff I snapshot nightly, but that's it. So I don't have
an easy out from the oops query either.

- August



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Florian G. Pflug

August Zajonc wrote:

Gregory Stark wrote:

Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:

First we must run the query in serializable mode and replace 
the snapshot with a synthetic one, which defines visibility 
at the start of the desired transaction

We could use something that controls global xmin.
It would ensure that global xmin does not advance beyond
what still needs to be visible. This would probably be a 
sliding time window, or a fixed point in time that is

released by the dba/user.

Well there's another detail you have to cover aside from rolling back your
xmin. You have to find the rest of the snapshot including knowing what other
transactions were in-progress at the time you want to flash back to.

If you just roll back xmin and set xmax to the same value you'll get a
consistent view of the database but it may not match a view that was ever
current. That is, some of the transactions after the target xmin may have
committed before that xmin. So there was never a time in the database when
they were invisible but your new xmin was visible.

[...]
Incidentally this is one of the things that would be useful for read-only
access to PITR warm standby machines.



Couldn't you define things simply to be that you get a consistent view
including all transactions started before x transaction? This is time
travel lite, but low overhead which I think is a key benefit of this
approach.


I was thinking along the same lines. Flashback is probably only really
useful on databases that are mostly read-only, but with a few users
who update data. You'd use flashback to undo catastrophic changes made
by accident, and would probably gladly accept undoing a little more
work than strictly necessary.

By contrast, if you're running an online shop where people buy stuff
24/7 and, say, somebody accidentally deletes some products, then you
won't want to lose the orders that happened during the last hour, but
will rather try to regenerate those products from your last backup.

So I don't think it's too important exactly what snapshot you get,
making the xmin=xmax idea feasible.

The same holds true for PITR warm standby (read-only queries on PITR
slaves). This would be used for reporting, or load-balancing of searches
in fairly static data, none of which depends on the exact snapshot
you get.


greetings, Florian Pflug




Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread August Zajonc

On Mon, 19 Feb 2007 20:30:59 +0100, Florian G. Pflug [EMAIL PROTECTED]
said:
 August Zajonc wrote:
  Gregory Stark wrote:
  
  Couldn't you define things simply to be that you get a consistent view
  including all transactions started before x transaction? This is time
  travel lite, but low overhead which I think is a key benefit of this
  approach.
 
 I was thinking along the same lines. Flashback is probably only really
 useful on databases that are mostly read-only, but with a few users
 who update data. You'd use flashback to undo catastrophic changes made
 by accident, and probably will gladly accept that you undo a little
 more work than strictly necessary.
 
 On the contrary, if you're running an online shop where people buy stuff
 24/7, and, say, somebody accidentally deletes some products, then you
 won't want to lose the orders that happened during that last hour, but
 will rather try to regenerate those products from your last backup.

Hopefully people doing order systems are using PITR or similar :) 

For the time travel lite case, it's just a matter of clear definition.
You get all transactions that were *started* before and up to trx x. If
a transaction rolled back you still won't see it, so you're still
getting a consistent view. But if it committed after your marker you
will see it. That seems OK to me. In fact, I suspect folks think of
transactions as happening more or less when they get sent to the DB, so
this may map more directly to what people expect.

The one caveat would be that if you started a long-running transaction,
then did the oops trx 5 minutes later, and then started time travel
*before* the long-running trx committed, you wouldn't see that
long-running trx. So the definition would need to be modified to
something like all trx started before x that were no longer running
when you time travel. I don't know if it is worth a NOTICE in the logs
if you time travel back but there are ids of transactions from before
your xmin that are still running (and that, if you waited a bit, might
become visible in your time travel view).

If Jan gets his way with a timestamp on trx commit, then you can do
started before time x, which may be more user-friendly.

For PITR I'd imagine you might actually be able to get the visibility
right, no? I haven't looked deeply enough into the WAL logs to
understand how the partial-playback scenario works. If the WAL logs are
ordered by trx commit time, then you'd get proper visibility.

- August



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Tom Lane
August Zajonc [EMAIL PROTECTED] writes:
 The key is how lightweight the setup could be, which matters because
 clients are not always willing to pay for a PITR setup. The low overhead
 would mean you'd feel fine about setting guc to 1hr or so.

This would have exactly the same performance consequences as always
having an hour-old open transaction.  I'm afraid that describing it
as low overhead is mere wishful thinking: it would cripple vacuuming
of high-update tables and greatly increase the typical load on pg_clog
and pg_subtrans.  We already know that pg_subtrans contention can be a
source of context-swap storms, with the size of the window back to
GlobalXmin being the controlling factor for how bad it gets.

It's possible that this last could be addressed by separating the
concept of old enough to be vacuumed from GlobalXmin, but it's
certainly not a trivial thing.

regards, tom lane



Re: [HACKERS] New feature request: FlashBack Query

2007-02-19 Thread Jonah H. Harris

On 2/17/07, Joshua D. Drake [EMAIL PROTECTED] wrote:

My understanding is that the main difference is that rollbacks are
inexpensive for us, but expensive for Oracle.


Yes, Oracle is optimized for COMMIT, we're optimized for ROLLBACK :)

In all seriousness, last time I checked Oracle's MVCC was covered by
two patents.

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 3rd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/



Re: [HACKERS] New feature request: FlashBack Query

2007-02-18 Thread Hannu Krosing
Ühel kenal päeval, L, 2007-02-17 kell 22:49, kirjutas Chad Wagner:

 
 
 However, they don't have vacuum, we do.
 
 Right, and I think that is more or less because Oracle doesn't need
 it.  Vacuum's main purpose (correct me if I am wrong) is to
 recover/mark rows that are no longer used, and Oracle essentially
 reuses the space immediately. 
 
 Obviously with Oracle if you bloat out a table and delete a ton of
 rows then you have to rebuild the table, but that is more or less the
 same problem that PostgreSQL has and where vacuum full comes into
 play.
 
 The only benefit with the Oracle model is that you can achieve
 flashback, which is a very rarely used feature in my book.

We can have flashbacks up to the last vacuum. It is just not exposed.
Don't vacuum, and you have the whole history. (Actually you can't go for
more than 2G transactions, or you get trx id rollover).

To get a flashback query, you just have to construct a snapshot from
that time and you are done. We don't store transaction times anywhere, so
the flashback has to be by transaction id, but there is very little
extra work involved. We just don't have syntax for saying SELECT ... AS
SEEN BY TRANSACTION XXX

AFAIK, Oracles flashbacks also can go as far back as there are rollback
segments.

Postgres' original design prescribed that VACUUM would not delete dead
tuples, but just move them to history tables on cheap(er) WORM storage.
Doing that would have very little overhead (except writing the old
tuples) and would not need any fundamental changes to how we do things
currently.

   The disadvantage is likely the overhead of performing the rollback and
 possibly more scattered reads.  

I've also heard reports that doing concurrent data loading and big
analysis queries is a royal pain in Oracle.

 I can say that I have used it, and it has come in handy, but hardly
 worth it.  The benefit with the PostgreSQL model is the likelihood of
 the old rows being inline with the rest of the table data, potentially
 reducing scattered reads.  The disadvantage is vacuuming, it seems to
 be often overlooked -- possibly solved by defaulting autovacuum to on?
 (seems to be the way Oracle is heading, defaulting statistics
 collection to on and other management features). 
 
-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




Re: [HACKERS] New feature request: FlashBack Query

2007-02-18 Thread Joshua D. Drake
Hannu Krosing wrote:
 Ühel kenal päeval, L, 2007-02-17 kell 22:49, kirjutas Chad Wagner:
 

 However, they don't have vacuum, we do.

 Right, and I think that is more or less because Oracle doesn't need
 it.  Vacuum's main purpose (correct me if I am wrong) is to
 recover/mark rows that are no longer used, and Oracle essentially
 reuses the space immediately. 

 Obviously with Oracle if you bloat out a table and delete a ton of
 rows then you have to rebuild the table, but that is more or less the
 same problem that PostgreSQL has and where vacuum full comes into
 play.

 The only benefit with the Oracle model is that you can achieve
 flashback, which is a very rarely used feature in my book.
 
 We can have flashbacks up to the last vacuum. It is just not exposed.
 Don't vacuum, and you have the whole history. (Actually you can't go for
 more than 2G transactions, or you get trx id rollover).
 
 To get a flashback query, you just have to construct a snapshot from
  that time and you are done. We don't store transaction times anywhere, so
 the flashback has to be by transaction id, but there is very little
 extra work involved. We just don't have syntax for saying SELECT ... AS
 SEEN BY TRANSACTION XXX

Well this is certainly interesting. What do we think it would take to
enable the functionality?

Joshua D. Drake



-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




[HACKERS] New feature request: FlashBack Query

2007-02-17 Thread RPK

PostgreSQL, already a mature database, needs to have more options for
recovery as compared to proprietary databases. I just worked with Oracle's
FlashBack query feature in Oracle 9i and FlashBack Table feature in 10g.

Future versions of PostgreSQL must have similar features which enable users
to bring Table(s) and/or Database(s) to a desired Time Stamp.
-- 
View this message in context: 
http://www.nabble.com/New-feature-request%3A-FlashBack-Query-tf3245023.html#a9020502
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.




Re: [HACKERS] New feature request: FlashBack Query

2007-02-17 Thread Joshua D. Drake
RPK wrote:
 PostgreSQL, already a mature database, needs to have more options for
 recovery as compared to proprietary databases. I just worked with Oracle's
 FlashBack query feature in Oracle 9i and FlashBack Table feature in 10g.
 
 Future versions of PostgreSQL must have similar features which enable users
 to bring Table(s) and/or Database(s) to a desired Time Stamp.

We can do it with databases; we can't do it with tables. Nor should we
do it with tables, as it would require that all related tables are
also flashed back.

Joshua D. Drake

-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] New feature request: FlashBack Query

2007-02-17 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 RPK wrote:
 Future versions of PostgreSQL must have similar features which enable users
 to bring Table(s) and/or Database(s) to a desired Time Stamp.

 We can do it with databases; we can't do it with tables. Nor should we
 do it with tables, as it would require that all related tables also be
 flashed back.

AFAICT this is a request to re-instate Time Travel, which is a feature
we removed more than ten years ago because the overhead was utterly
unacceptable.  And the project's idea of acceptable performance then
was orders of magnitude weaker than it is now.  So it's not going to
happen, at least not in the general release.  You might take a look at
contrib/spi/README.timetravel, though, for a prototype of how something
similar can be achieved without any changes to the core system.  That
module is a bit unmaintained and could doubtless do with some updates
--- for starters, it should be using timestamptz instead of the old
deprecated abstime.  If you're interested, feel free to work on it.
No one else has taken an interest in a long time.
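
That trigger-maintained scheme is easy to sketch in miniature. A toy
illustration of the semantics (names and structure invented for this example,
not the contrib module's actual API): every update closes the old row's
validity interval and inserts a new version, so an "as of" query is just a
filter on the interval.

```python
# Sketch of timetravel-style row versioning, in the spirit of
# contrib/spi/timetravel (illustrative only, not its real API).
from datetime import datetime

INFINITY = datetime.max

rows = []   # each row: (key, value, valid_from, valid_to)

def tt_update(key, value, now):
    """Close the currently-valid version of key (if any), insert a new one."""
    for i, (k, v, t_from, t_to) in enumerate(rows):
        if k == key and t_to == INFINITY:
            rows[i] = (k, v, t_from, now)      # old version's validity ends now
    rows.append((key, value, now, INFINITY))   # new version valid from now on

def as_of(key, when):
    """Return the value of key as it stood at time `when`, or None."""
    for k, v, t_from, t_to in rows:
        if k == key and t_from <= when < t_to:
            return v
    return None

tt_update("acct", 100, datetime(2007, 2, 1))
tt_update("acct", 250, datetime(2007, 2, 10))

assert as_of("acct", datetime(2007, 2, 5)) == 100    # flashback-style query
assert as_of("acct", datetime(2007, 2, 15)) == 250   # current value
```

Note the overhead this implies: every update grows the table, which is
exactly why time travel was dropped from the core system.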

regards, tom lane



Re: [HACKERS] New feature request: FlashBack Query

2007-02-17 Thread elein
On Sat, Feb 17, 2007 at 11:48:55AM -0500, Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
  RPK wrote:
  Future versions of PostgreSQL must have similar features which enable users
  to bring Table(s) and/or Database(s) to a desired Time Stamp.
 
  We can do it with databases; we can't do it with tables. Nor should we
  do it with tables, as it would require that all related tables also be
  flashed back.
 
 AFAICT this is a request to re-instate Time Travel, which is a feature
 we removed more than ten years ago because the overhead was utterly
 unacceptable.  And the project's idea of acceptable performance then
 was orders of magnitude weaker than it is now.  So it's not going to
 happen, at least not in the general release.  You might take a look at
 contrib/spi/README.timetravel, though, for a prototype of how something
 similar can be achieved without any changes to the core system.  That
 module is a bit unmaintained and could doubtless do with some updates
 --- for starters, it should be using timestamptz instead of the old
 deprecated abstime.  If you're interested, feel free to work on it.
 No one else has taken an interest in a long time.
 

For other recent time travel ideas see: 
http://www.varlena.com/GeneralBits/122.php
Time travel is not cheap, though.

--elein
[EMAIL PROTECTED]



Re: [HACKERS] New feature request: FlashBack Query

2007-02-17 Thread Chad Wagner

On 2/17/07, elein [EMAIL PROTECTED] wrote:


For other recent time travel ideas see:
http://www.varlena.com/GeneralBits/122.php
Time travel is not cheap, though.




I am sure this topic has probably been beaten to death in the past, but has
anyone talked about the advantages of Oracle's MVCC model versus
PostgreSQL's MVCC model?  Oracle achieves multiversioning by using
rollback/undo segments, where PostgreSQL appears to place (essentially) the
undo in the same space as the table.
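
To make the in-heap side of that comparison concrete, here is a toy sketch
(invented names, not PostgreSQL internals): each tuple version carries the
inserting and deleting transaction ids, and a snapshot simply filters on
them, so old versions live in the same space as the current ones.

```python
# Simplified MVCC tuple visibility: every xid below the snapshot xid is
# assumed committed, which glosses over in-progress/aborted transactions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class HeapTuple:
    data: str
    xmin: int               # xid that inserted this version
    xmax: Optional[int]     # xid that deleted/updated it, or None if live

def visible(tup: HeapTuple, snapshot_xid: int) -> bool:
    """True if this tuple version is visible to a snapshot taken at snapshot_xid."""
    if tup.xmin >= snapshot_xid:
        return False        # created after the snapshot was taken
    if tup.xmax is not None and tup.xmax < snapshot_xid:
        return False        # already deleted before the snapshot was taken
    return True

# An UPDATE leaves the old version in the heap, stamped with xmax,
# and appends a new version with a fresh xmin -- no separate undo segment.
heap = [HeapTuple("v1", xmin=100, xmax=200), HeapTuple("v2", xmin=200, xmax=None)]

assert [t.data for t in heap if visible(t, 150)] == ["v1"]   # older snapshot
assert [t.data for t in heap if visible(t, 300)] == ["v2"]   # newer snapshot
```

The Oracle-style alternative would instead overwrite the row in place and
copy the old image to an undo segment, reconstructing it on demand.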

If I were to guess, this is probably a major thing to change.  Clearly there
are advantages to both: with Oracle, the space consumed by a modified row is
immediately available for reuse, and there is generally little row migration
(assuming there is enough free space on the block), so you should be able to
avoid index updates and the bloat that seems to go along with vacuuming.

Are there any previous discussions that folks could point out here?


Re: [HACKERS] New feature request: FlashBack Query

2007-02-17 Thread Joshua D. Drake
Chad Wagner wrote:
 On 2/17/07, elein [EMAIL PROTECTED] wrote:

 For other recent time travel ideas see:
 http://www.varlena.com/GeneralBits/122.php
 Time travel is not cheap, though.

 
 
 I am sure this topic has probably been beaten to death in the past, but has
 anyone talked about the advantages of Oracle's MVCC model versus
 PostgreSQL's MVCC model?  Oracle achieves multiversioning by using
 rollback/undo segments, where PostgreSQL appears to place (essentially) the
 undo in the same space as the table.

My understanding is that the main difference is that rollbacks are
inexpensive for us, but expensive for Oracle. Talk to an Oracle DBA
about their Rollback logs :0.

However, they don't have vacuum, we do.

Joshua D. Drake

 
 If I were to guess this is probably a major thing to change.  Clearly there
 are advantages to both, with Oracle essentially the space consumed by a
 modified row is immediately available for reuse and generally there is
 little row migration assuming there is enough space on the block so you
 should be able to avoid updates to the index and the bloating that seems to
 go along with vacuuming.
 
 Is there any previous discussions that folks could point out here?
 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] New feature request: FlashBack Query

2007-02-17 Thread Chad Wagner

On 2/17/07, Joshua D. Drake [EMAIL PROTECTED] wrote:


My understanding is that the main difference is that rollbacks are
inexpensive for us, but expensive for Oracle. Talk to an Oracle DBA
about their Rollback logs :0.



Yes, I have seen cases where undo segments are thrashed.  Generally it works
well, and I agree it is likely much cheaper in PostgreSQL, since you would
expect fewer scattered reads because the old version is inline with the rest
of the data.

But if I recall, undo segments are cached in Oracle; usually where I see
problems is where the DBA is completely incompetent and has undersized the
buffer cache.  Oracle does direct reads (AFAIK) -- so undersizing the buffer
cache can be brutal.  A very common mistake with Oracle was undersizing the
buffer cache and oversizing the shared pool (where the shared plans are
stored); with 9i and later they tried to have the management tools suggest
ideal values, or have them automatically managed by the database.  Probably a
step in the right direction, but I know they still have bumps to iron out.  :)


However, they don't have vacuum, we do.




Right, and I think that is more or less because Oracle doesn't need it.
Vacuum's main purpose (correct me if I am wrong) is to recover/mark rows
that are no longer used, and Oracle essentially reuses the space
immediately.

Obviously with Oracle if you bloat out a table and delete a ton of rows then
you have to rebuild the table, but that is more or less the same problem
that PostgreSQL has and where vacuum full comes into play.

The only benefit of the Oracle model is that you can achieve flashback,
which is a very rarely used feature in my book.  The disadvantages are likely
the overhead of performing rollbacks and possibly more scattered reads.  I
can say that I have used it, and it has come in handy, but it is hardly worth
it.  The benefit of the PostgreSQL model is the likelihood of the old rows
being inline with the rest of the table data, potentially reducing scattered
reads.  The disadvantage is vacuuming; it seems to be often overlooked --
possibly solved by defaulting autovacuum to on?  (That seems to be the way
Oracle is heading, defaulting statistics collection to on, along with other
management features.)
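
A toy illustration of that recovery step (invented structure, not PostgreSQL
internals): once a deleted tuple version is older than every snapshot still
running, no query can ever see it again, so vacuum can reclaim its space.

```python
# Simplified vacuum: drop tuple versions whose deleting xid is older than
# the oldest still-active snapshot (all such xids assumed committed).
def vacuum(heap, oldest_active_xid):
    """Keep only tuple versions that some current or future snapshot could see."""
    return [t for t in heap
            if t["xmax"] is None or t["xmax"] >= oldest_active_xid]

heap = [
    {"data": "old", "xmin": 10, "xmax": 20},    # dead: deleted long ago
    {"data": "new", "xmin": 20, "xmax": None},  # live: never deleted
]

heap = vacuum(heap, oldest_active_xid=50)
assert [t["data"] for t in heap] == ["new"]     # dead version reclaimed
```

In the Oracle model the space comes back immediately on commit instead,
which is why it has no equivalent maintenance pass.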


Re: [HACKERS] New feature request: FlashBack Query

2007-02-17 Thread Tom Lane
Chad Wagner [EMAIL PROTECTED] writes:
 I am sure this topic has probably been beaten to death in the past, but has
 anyone talked about the advantages of Oracle's MVCC model versus
 PostgreSQL's MVCC model?

Yes, we've been all through that.  We like ours.  See the archives.

regards, tom lane



Re: [HACKERS] New feature request: FlashBack Query

2007-02-17 Thread Warren Turkal
On Saturday 17 February 2007 07:49, RPK wrote:
 PostgreSQL, already a mature database, needs to have more options for
 recovery as compared to proprietary databases. I just worked with Oracle's
 FlashBack query feature in Oracle 9i and FlashBack Table feature in 10g.

 Future versions of PostgreSQL must have similar features which enable users
 to bring Table(s) and/or Database(s) to a desired Time Stamp.

Check out my proposal[1] for temporal extensions.  Ultimately, creating
valid-time and transaction-time tables would be possible through my proposal.
Please check it out.

[1]http://archives.postgresql.org/pgsql-hackers/2007-02/msg00540.php

wt
-- 
Warren Turkal (w00t)
