Re: [HACKERS] start of transaction (was: Re: [PERFORM] Help with count(*))

2003-11-18 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Hannu Krosing wrote:
 It would be even better to have now() that returns the time current
 transaction is COMMITted as this is the time other backend become aware
 of it ;)

 True, but implementing that would be very hard.

Son, that was a *joke* ...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] start of transaction (was: Re: [PERFORM] Help with count(*))

2003-11-17 Thread Bruce Momjian
Hannu Krosing wrote:
 Bruce Momjian kirjutas E, 17.11.2003 kell 02:31:
 
  Defining now() as the first call seems pretty arbitrary to me.  I can't
  think of any time-based interface that has that API.  And what if a
  trigger called now() in an earlier query and you didn't even know about
  it.
 
 That would be OK. The whole point of that previous discussion was to
 have now() that returns the same value over the span of the whole
 transaction.

I think my issue is that there isn't any predictable way for a user to
know when the now() time is recorded.  By using start of transaction, at
least we know for sure the point in time it is showing.

 It would be even better to have now() that returns the time current
 transaction is COMMITted as this is the time other backend become aware
 of it ;)

True, but implementing that would be very hard.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[HACKERS] start of transaction (was: Re: [PERFORM] Help with count(*))

2003-11-16 Thread Hannu Krosing
Redirected to -hackers

Neil Conway kirjutas L, 15.11.2003 kell 22:20:
 Tom Lane [EMAIL PROTECTED] writes:
  (I believe the previous discussion also agreed that we wanted to
  postpone the freezing of now(), which currently also happens at
  BEGIN rather than the first command after BEGIN.)
 
 That doesn't make sense to me: from a user's perspective, the start
 of the transaction is when the BEGIN is issued, regardless of any
 tricks we may play in the backend.

For me, the start of transaction is not about time, but about grouping
a set of statements into one. So making the exact moment of start be
the first statement that actually does something with data seems
perfectly reasonable. If you really need to preserve time, do select
current_timestamp and use the result.

 Making now() return the time the current transaction started is
 reasonably logical; making now() return the time when the first
 command after the BEGIN in the current transaction was issued makes a
 lot less sense to me.

for me the time the current transactuion is started == the time when
the first command after the BEGIN in the current transaction was issued
and thus I see no conflict here ;)

Delaying the locking effects of transactions as long as possible can
increase performance overall, not just for pathological clients that sit
on idle open transactions.

Probably the latest time we can start the transaction is ath the start
of executor step after the first statement in a transaction is planned
and optimized.

---
Hannu



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] start of transaction (was: Re: [PERFORM] Help with count(*))

2003-11-16 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes:
 Probably the latest time we can start the transaction is ath the start
 of executor step after the first statement in a transaction is planned
 and optimized.

The transaction has to exist before it can take locks, so the above
would not fly.

A complete example of what we have to think about is:

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
LOCK TABLE foo;
UPDATE foo ...  -- or in general a SELECT/UPDATE/INSERT/DELETE query
... etc ...

The transaction snapshot *must* be set at the time of the first query
(here, the UPDATE).  It obviously can't be later, and it cannot be
earlier either, because in this sort of example you need the requested
locks to be taken before the snapshot is set.

The transaction must be created (as observed by other backends, in
particular VACUUM) not later than the LOCK statement, else there is
nothing that can own the lock.  In principle though, the effects of
BEGIN and perhaps SET could be strictly local to the current backend,
and only when we hit a LOCK or query do we create the transaction
externally.

In practice the problem we observe is clients that issue BEGIN and then
go to sleep (typically because of poorly-designed autocommit behavior in
interface libraries).  Postponing externally-visible creation of the
transaction to the first command after BEGIN would be enough to get
around the real-world issues, and it would not require code changes
nearly as extensive as trying to let other stuff like SET happen
before the transaction starts.

There isn't any compelling implementation reason when to freeze the
value of now().  Reasonable options are
1. at BEGIN (current behavior)
2. at transaction's external creation 
3. at freezing of transaction snapshot
#1 and #2 are actually the same at the moment, but could be decoupled
as sketched above, in which case the behavior of #2 would effectively
become at first command afte BEGIN.

In the previous thread:
http://archives.postgresql.org/pgsql-hackers/2003-03/msg01178.php
I argued that now() should be frozen at the time of the transaction
snapshot, and I still think that that's a defensible behavior.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] start of transaction (was: Re: [PERFORM] Help with count(*))

2003-11-16 Thread Neil Conway
Hannu Krosing [EMAIL PROTECTED] writes:
 For me, the start of transaction is not about time, but about grouping
 a set of statements into one. So making the exact moment of start be
 the first statement that actually does something with data seems
 perfectly reasonable.

This might be a perfectly logical change in semantics, but what
benefit does it provide over the old way of doing things?

What does BEGIN actually do now, from a user's perspective? At
present, it starts a transaction block, which is pretty simple. If
we adopted the proposed change, it would change the state of the
system so that the next command is part of a new transaction. This is
naturally more complex; but more importantly, what benefit does it
ACTUALLY provide to the user?

(I can't see one, but perhaps I'm missing something...)

 Delaying the locking effects of transactions as long as possible can
 increase performance overall, not just for pathological clients that sit
 on idle open transactions.

I agree, but this is irrelevant to the semantics of now().

-Neil


---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] start of transaction (was: Re: [PERFORM] Help with count(*))

2003-11-16 Thread Neil Conway
Tom Lane [EMAIL PROTECTED] writes:
 That's defensible when the user issued the BEGIN himself.  When the
 BEGIN is coming from some interface library's autocommit logic, it's
 a lot less defensible.  If you consult the archives, you will find
 actual user complaints about why is now() returning a very old time?
 that we traced to use of interface layers that handle commit() by
 issuing COMMIT; BEGIN;.

Hmmm... I agree this behavior isn't ideal, although I can see the case
for viewing this as a mistake by the application developer: they are
assuming that they know exactly when transactions begin, which is not
a feature provided by their language interface. They should be using
current_timestamp, and/or changing their language interface's
configuration.

That said, I think this is a minor irritation at best. The dual
drawbacks of breaking backward compatibility and making the BEGIN
semantics more confusing is enough to leave me satisfies with the
status quo.

If we do change this, I think Dennis' idea of making now() always
return the same value within a given transaction is interesting: that
might be a way to fix this problem without confusing the semantics of
BEGIN.

-Neil


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] start of transaction (was: Re: [PERFORM] Help with count(*))

2003-11-16 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Hmmm... I agree this behavior isn't ideal, although I can see the case
 for viewing this as a mistake by the application developer: they are
 assuming that they know exactly when transactions begin, which is not
 a feature provided by their language interface.

Well, actually, it's a bug in the interface IMHO.  But as I said in the
last thread, it's a fairly widespread bug.  We've been taking the
position that the interface libraries should get fixed, and that's not
happening.  It's probably time to look at a server-side fix.

 If we do change this, I think Dennis' idea of making now() always
 return the same value within a given transaction is interesting:

You mean the time of the first now() call?  I thought that was an
interesting idea also, but it's probably not going to look so hot
when we complete the TODO item of adding access to
the start-of-current-statement time.  Having start-of-transaction be
later than start-of-statement isn't gonna fly :-(.  If we were willing
to abandon that TODO item then I'd be interested in defining now() as
Dennis suggested.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] start of transaction (was: Re: [PERFORM] Help with count(*))

2003-11-16 Thread Bruce Momjian
Tom Lane wrote:
 Neil Conway [EMAIL PROTECTED] writes:
  Hmmm... I agree this behavior isn't ideal, although I can see the case
  for viewing this as a mistake by the application developer: they are
  assuming that they know exactly when transactions begin, which is not
  a feature provided by their language interface.
 
 Well, actually, it's a bug in the interface IMHO.  But as I said in the
 last thread, it's a fairly widespread bug.  We've been taking the
 position that the interface libraries should get fixed, and that's not
 happening.  It's probably time to look at a server-side fix.
 
  If we do change this, I think Dennis' idea of making now() always
  return the same value within a given transaction is interesting:
 
 You mean the time of the first now() call?  I thought that was an
 interesting idea also, but it's probably not going to look so hot
 when we complete the TODO item of adding access to
 the start-of-current-statement time.  Having start-of-transaction be
 later than start-of-statement isn't gonna fly :-(.  If we were willing
 to abandon that TODO item then I'd be interested in defining now() as
 Dennis suggested.

Defining now() as the first call seems pretty arbitrary to me.  I can't
think of any time-based interface that has that API.  And what if a
trigger called now() in an earlier query and you didn't even know about
it.


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] start of transaction (was: Re: [PERFORM] Help with count(*))

2003-11-16 Thread Greg Stark

Neil Conway [EMAIL PROTECTED] writes:

 What does BEGIN actually do now, from a user's perspective? 

I think you're thinking about this all wrong. BEGIN doesn't do anything.
It's not a procedural statement, it's a declaration. It declares that the
block of statements form a transaction so reads should be consistent and
failures should be handled in a particular way to preserve data integrity.

Given that declaration and the guarantees it requires of the database it's
then up to the database to figure out what constraints that imposes on what
the database can do and still meet the guarantees the BEGIN declaration
requires. The more clever the database is about minimizing those restrictions
the better as it means the database can run more efficiently.

For what it's worth, this is how Oracle handles things too. On the
command-line issuing a BEGIN following a COMMIT is just noise; you're _always_
in a transaction. A COMMIT ends the previous the transaction and implicitly
starts the next transaction. But the snapshot isn't frozen until you first
read from a table.

I'm not sure what other databases do, but I think this is why clients behave
like this. They think of BEGIN as a declaration and therefore initiating a
COMMIT;BEGIN; at the end of every request is perfectly logical, and works fine
in at least Oracle, and probably other databases.

-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match