On Tue, 2008-12-30 at 18:31 +0200, Heikki Linnakangas wrote:
> You have to be careful to ignore the flags in read-only transactions
> that started in hot standby mode, even if recovery has since ended and
> we're in normal operation now.
My initial implementation in v6 worked, but had a corner
On Tue, 2008-12-30 at 18:31 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > (a) always ignore LP_DEAD flags we see when reading index during
> > recovery.
>
> This sounds simplest, and it's nice to not clear the flags for the
> benefit of transactions running after the recovery is done
Simon Riggs wrote:
(a) always ignore LP_DEAD flags we see when reading index during
recovery.
This sounds simplest, and it's nice to not clear the flags for the
benefit of transactions running after the recovery is done.
You have to be careful to ignore the flags in read-only transactions
t
Simon Riggs wrote:
Issues (2) and (3) would go away entirely if both standby and primary
always had the same xmin value as a system-wide setting. i.e. the
standby and primary are locked together at their xmins. Perhaps that was
Heikki's intention in recent suggestions?
No, I only suggested tha
On Fri, 2008-12-19 at 09:22 -0500, Greg Stark wrote:
> I'm confused shouldn't read-only transactions on the slave just be
> hacked to not set any hint bits including lp_delete?
It seems there are multiple issues involved and I saw only the first of
these initially. I want to explicitly separat
marcin mank wrote:
Perhaps we should listen to the people that have said they don't want
queries cancelled, even if the alternative is inconsistent answers.
I don't like that much. PostgreSQL has traditionally avoided that very
hard. It's hard to tell what kind of inconsistencies you'd get, as
> Perhaps we should listen to the people that have said they don't want
> queries cancelled, even if the alternative is inconsistent answers.
I think an alternative to that would be "if the wal backlog is too
big, let current queries finish and let incoming queries wait till the
backlog gets small
On Wed, Dec 24, 2008 at 7:18 PM, Simon Riggs wrote:
>
>
>
> With respect, I was hoping you might look in the patch and see if you
> agree with the way it is handled. No need to remember. The whole
> latestRemovedXid concept is designed to help.
>
Well, that's common for all cleanup record incl
On Wed, 2008-12-24 at 09:59 -0500, Robert Treat wrote:
> I think the uncertainty comes from people's experience with typical
> replication use cases vs a lack of experience with this current
> implementation.
Quite possibly.
Publishing user feedback on this will be very important in making t
On Wednesday 24 December 2008 08:48:04 Simon Riggs wrote:
> On Wed, 2008-12-24 at 17:56 +0530, Pavan Deolasee wrote:
> > Again, I haven't seen how frequently queries may get canceled. Or if
> > the delay is set to a large value, how far behind standby may get
> > during replication, so I can't real
On Wed, 2008-12-24 at 17:56 +0530, Pavan Deolasee wrote:
> On Wed, Dec 24, 2008 at 5:26 PM, Simon Riggs wrote:
> >
>
> >
> > The patch does go to some trouble to handle that case, as I'm sure
> > you've seen. Are you saying that part of the patch is ineffective and
> > should be removed, or?
> >
On Wed, Dec 24, 2008 at 5:26 PM, Simon Riggs wrote:
>
>
> The patch does go to some trouble to handle that case, as I'm sure
> you've seen. Are you saying that part of the patch is ineffective and
> should be removed, or?
>
Umm, are you talking about the "wait" mechanism? That's the only
thing
On Wed, 2008-12-24 at 16:48 +0530, Pavan Deolasee wrote:
> On Wed, Dec 24, 2008 at 4:41 PM, Simon Riggs wrote:
> >
> >
> > Greg and Heikki have highlighted in this thread some aspects of btree
> > garbage collection that will increase the chance of queries being
> > cancelled in various circumsta
On Wed, Dec 24, 2008 at 4:41 PM, Simon Riggs wrote:
>
>
> Greg and Heikki have highlighted in this thread some aspects of btree
> garbage collection that will increase the chance of queries being
> cancelled in various circumstances
Even HOT-prune may lead to frequent query cancellations and unli
On Tue, 2008-12-23 at 23:59 -0500, Robert Treat wrote:
> On Friday 19 December 2008 19:36:42 Simon Riggs wrote:
> > Perhaps we should listen to the people that have said they don't want
> > queries cancelled, even if the alternative is inconsistent answers. That
> > is easily possible yet is not c
On Friday 19 December 2008 19:36:42 Simon Riggs wrote:
> Perhaps we should listen to the people that have said they don't want
> queries cancelled, even if the alternative is inconsistent answers. That
> is easily possible yet is not currently an option. Plus we have the
> option I referred to up t
On Saturday 20 December 2008 04:10:21 Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Sat, 2008-12-20 at 09:21 +0200, Heikki Linnakangas wrote:
> >> Gregory Stark wrote:
> >>> Simon Riggs writes:
> Increasing the waiting time increases the failover time and thus
> decreases the val
On Fri, 2008-12-19 at 14:23 -0600, Kevin Grittner wrote:
> > I guess making it that SQLSTATE would make it simpler to understand why
> > the error occurs and also how to handle it (i.e. resubmit).
>
> Precisely.
Just confirming I will implement the SQLSTATE as requested.
I recognize my own
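
As a rough illustration of the "resubmit" handling discussed above, a client could wrap a read-only query in a small retry loop. This is a minimal sketch, assuming the standby cancellation surfaces as SQLSTATE 40001 as requested in the thread; `QueryError` and `run_with_retry` are hypothetical names, not part of PostgreSQL or any driver.

```python
import time

# SQLSTATE for serialization failure, as discussed in the thread.
SERIALIZATION_FAILURE = "40001"


class QueryError(Exception):
    """Hypothetical driver error that carries an SQLSTATE code."""

    def __init__(self, sqlstate, message=""):
        super().__init__(message or sqlstate)
        self.sqlstate = sqlstate


def run_with_retry(run_query, max_attempts=3, backoff_s=0.0):
    """Call run_query() and resubmit it if it fails with SQLSTATE 40001."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except QueryError as exc:
            # Only the retryable SQLSTATE is resubmitted; any other error,
            # or the final failed attempt, propagates to the caller.
            if exc.sqlstate != SERIALIZATION_FAILURE or attempt == max_attempts:
                raise
            time.sleep(backoff_s * attempt)
```

A real application would map its driver's exception type onto the SQLSTATE check; the point is only that a single, well-known code lets the client decide to retry without parsing error text.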
On Sat, 2008-12-20 at 20:09 -0300, Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
> > Gregory Stark wrote:
> >> A vacuum being replayed -- even in a different database -- could trigger
> >> the error. Or with the btree split issue, a data load -- again even in a
> >> different databa
On Sat, 2008-12-20 at 22:07 +0200, Heikki Linnakangas wrote:
> Gregory Stark wrote:
> > A vacuum being replayed -- even in a different database -- could trigger the
> > error. Or with the btree split issue, a data load -- again even in a
> > different database -- would quite likely cause yo
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
>> Heikki Linnakangas wrote:
>>> Gregory Stark wrote:
A vacuum being replayed -- even in a different database -- could trigger the
error. Or with the btree split issue, a data load -- again even in a different
database --
Alvaro Herrera wrote:
Heikki Linnakangas wrote:
Gregory Stark wrote:
A vacuum being replayed -- even in a different database -- could trigger the
error. Or with the btree split issue, a data load -- again even in a different
database -- would quite likely cause your SELECT to be killed.
Hmm,
Heikki Linnakangas wrote:
> Gregory Stark wrote:
>> A vacuum being replayed -- even in a different database -- could trigger the
>> error. Or with the btree split issue, a data load -- again even in a
>> different database -- would quite likely cause your SELECT to be killed.
>
> Hmm, I wond
Gregory Stark wrote:
A vacuum being replayed -- even in a different database -- could trigger the
error. Or with the btree split issue, a data load -- again even in a different
database -- would quite likely cause your SELECT to be killed.
Hmm, I wonder if we should/could track the "latestRe
Simon Riggs wrote:
On Sat, 2008-12-20 at 09:21 +0200, Heikki Linnakangas wrote:
Gregory Stark wrote:
Simon Riggs writes:
Increasing the waiting time increases the failover time and thus
decreases the value of the standby as an HA system. Others value high
availability higher than you and so
On Sat, 2008-12-20 at 09:21 +0200, Heikki Linnakangas wrote:
> Gregory Stark wrote:
> > Simon Riggs writes:
> >
> >> Increasing the waiting time increases the failover time and thus
> >> decreases the value of the standby as an HA system. Others value high
> >> availability higher than you and s
Heikki Linnakangas wrote:
Gregory Stark wrote:
The question I had was whether your solution for btree pointers marked dead
and later dropped from the index works when the user hasn't configured a
timeout and doesn't want standby queries killed.
Yes, it's not any different from vacuum WAL reco
Gregory Stark wrote:
Simon Riggs writes:
Increasing the waiting time increases the failover time and thus
decreases the value of the standby as an HA system. Others value high
availability higher than you and so we had agreed to provide an option
to allow the max waiting time to be set.
Sure
On Fri, 2008-12-19 at 19:29 -0500, Robert Treat wrote:
> On Friday 19 December 2008 05:52:42 Simon Riggs wrote:
> > BTW, I noticed the other day that Oracle 11g only allows you to have a
> > read only slave *or* allows you to continue replaying. You need to
> > manually switch back and forth betwe
On Fri, 2008-12-19 at 20:54 +, Gregory Stark wrote:
> "Kevin Grittner" writes:
>
> > PostgreSQL is much less prone to serialization failures, but it is
> > certainly understandable if hot standby replication introduces new
> > cases of it.
>
> In this case it will be possible to get this er
On Friday 19 December 2008 05:52:42 Simon Riggs wrote:
> BTW, I noticed the other day that Oracle 11g only allows you to have a
> read only slave *or* allows you to continue replaying. You need to
> manually switch back and forth between those modes. They can't do
> *both*, as Postgres will be able
>>> Gregory Stark wrote:
> I think the fundamental difference is that a deadlock or serialization
> failure can be predicted as a potential problem when writing the code. This
> is something that can happen for any query any time, even plain old
> read-only select queries.
I've heard that
>>> Gregory Stark wrote:
> "Kevin Grittner" writes:
>
>> PostgreSQL is much less prone to serialization failures, but it is
>> certainly understandable if hot standby replication introduces new
>> cases of it.
>
> In this case it will be possible to get this error even if you're just
> runnin
"Kevin Grittner" writes:
> PostgreSQL is much less prone to serialization failures, but it is
> certainly understandable if hot standby replication introduces new
> cases of it.
In this case it will be possible to get this error even if you're just running
a single SELECT query -- and that's the
>>> Simon Riggs wrote:
> The SQL Standard specifically names this error as thrown when "it
> detects the inability to guarantee the serializability of two or more
> concurrent SQL-transactions". Now that really should only apply when
> running with SERIALIZABLE transactions,
I disagree. Data
"Kevin Grittner" writes:
Simon Riggs wrote:
>
>> max_standby_delay is set in recovery.conf, value 0 (forever) - 2,000,000
>> secs, settable in milliseconds. So think of it like a deadlock detector
>> for recovery apply.
>
> Aha! A deadlock is a type of serialization failure. (In
Simon Riggs writes:
> Increasing the waiting time increases the failover time and thus
> decreases the value of the standby as an HA system. Others value high
> availability higher than you and so we had agreed to provide an option
> to allow the max waiting time to be set.
Sure, it's a nice opt
On Fri, 2008-12-19 at 13:47 -0600, Kevin Grittner wrote:
> >>> Simon Riggs wrote:
>
> > max_standby_delay is set in recovery.conf, value 0 (forever) - 2,000,000
> > secs, settable in milliseconds. So think of it like a deadlock detector
> > for recovery apply.
>
> Aha! A deadlock is a t
>>> Simon Riggs wrote:
> max_standby_delay is set in recovery.conf, value 0 (forever) - 2,000,000
> secs, settable in milliseconds. So think of it like a deadlock detector
> for recovery apply.
Aha! A deadlock is a type of serialization failure. (In fact, on
databases with lock-based concur
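
The "deadlock detector for recovery apply" analogy in the quoted message can be sketched as a simple decision rule. This is only an illustration under stated assumptions: the function name and the "wait"/"cancel" return values are hypothetical, and the only behaviour taken from the thread is that `max_standby_delay` bounds how long replay waits and that 0 means wait forever.

```python
def standby_conflict_action(waited_ms, max_standby_delay_ms):
    """Decide whether WAL replay keeps waiting or cancels conflicting queries.

    Assumption taken from the quoted message: a max_standby_delay of 0
    means "forever", so replay never cancels a query in that case.
    """
    if max_standby_delay_ms == 0:
        return "wait"
    # Once replay has waited at least the configured delay, conflicting
    # standby queries are cancelled so that recovery can proceed.
    return "cancel" if waited_ms >= max_standby_delay_ms else "wait"
```

For example, with a delay of 1000 ms, replay that has waited 500 ms keeps waiting, while replay that has waited 1000 ms cancels the conflicting queries.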
On Fri, 2008-12-19 at 18:59 +, Gregory Stark wrote:
> Simon Riggs writes:
>
> > The error message ought to be "snapshot too old", which could raise a
> > chuckle, so I called it something else.
> >
> > The point you raise is a good one and I think we should publish a list
> > of retryable er
Simon Riggs writes:
> The error message ought to be "snapshot too old", which could raise a
> chuckle, so I called it something else.
>
> The point you raise is a good one and I think we should publish a list
> of retryable error messages. I contemplated once proposing a special log
> level for a
>>> Simon Riggs wrote:
> I understand the need, but we won't be using SQLSTATE = 40001.
>
> That corresponds to ERRCODE_T_R_SERIALIZATION_FAILURE, which that error
> would not be.
Isn't it a problem with serialization of database transactions? You
hit it in a different way, but if it is a t
On Fri, 2008-12-19 at 10:52 +, Simon Riggs wrote:
> > You could
> > conservatively use OldestXmin as latestRemovedXid, but that could stall
> > the WAL redo a lot more than necessary. Or you could store
> > latestRemovedXid in the page header, but that would need to be
> > WAL-logged to
On Fri, 2008-12-19 at 11:54 -0600, Kevin Grittner wrote:
> >>> Simon Riggs wrote:
>
> > If I was going to add anything to the btree page header, it would be
> > latestRemovedLSN, only set during recovery. That way we don't have to
> > explicitly kill queries, we can do a wait on OldestXm
>>> Simon Riggs wrote:
> If I was going to add anything to the btree page header, it would be
> latestRemovedLSN, only set during recovery. That way we don't have to
> explicitly kill queries, we can do a wait on OldestXmin then let
> them ERROR out when they find a page that has been modif
On Fri, 2008-12-19 at 09:22 -0500, Greg Stark wrote:
> I'm confused shouldn't read-only transactions on the slave just be
> hacked to not set any hint bits including lp_delete?
They could be, though I see no value in doing so.
But that is not Heikki's point. He is discussing what happens on
I'm confused shouldn't read-only transactions on the slave just be
hacked to not set any hint bits including lp_delete?
--
Greg
On 19 Dec 2008, at 03:49, Heikki Linnakangas wrote:
Whenever a B-tree index scan fetches a heap tuple that turns out to
be dead, the B-tree item is marked as k
On Fri, 2008-12-19 at 12:24 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > We have infrastructure in place to make this work correctly, just need
> > to add latestRemovedXid field to xl_btree_vacuum. So that part is easily
> > solved.
>
> That's tricky because there's no xmin/xmax on i
Simon Riggs wrote:
We have infrastructure in place to make this work correctly, just need
to add latestRemovedXid field to xl_btree_vacuum. So that part is easily
solved.
That's tricky because there's no xmin/xmax on index tuples. You could
conservatively use OldestXmin as latestRemovedXid, bu
On Fri, 2008-12-19 at 10:49 +0200, Heikki Linnakangas wrote:
> Whenever a B-tree index scan fetches a heap tuple that turns out to be
> dead, the B-tree item is marked as killed by calling _bt_killitems. When
> the page gets full, all the killed items are removed by calling
> _bt_vacuum_one_pa