[GENERAL] Confusing deadlock report

2016-03-16 Thread Thomas Kellerer
Hello,

we have a strange (at least to me) deadlock situation which does not seem to 
fall into the "usual" deadlock category. 

The error as reported in the Postgres log file is this:

2016-03-12 13:51:29.305 CET [23912]: [1-1] user=arthur,db=prod,app=[unknown] 
ERROR: deadlock detected 
2016-03-12 13:51:29.305 CET [23912]: [2-1] user=arthur,db=prod,app=[unknown] 
DETAIL: Process 23912 waits for ShareLock on transaction; blocked by process 
24342. 
Process 24342 waits for ShareLock on transaction 39632974; blocked by 
process 23912. 
Process 23912: UPDATE alpha SET some_flag = $1 WHERE (id = $2) 
Process 24342: INSERT INTO bravo (..., alpha_id) VALUES ($1, $2, $3, 
$4, $5, $6, $7, $8, $9, $10)

(I have "obfuscated" the table names)


Process 24342 did update table alpha in an earlier step, but a different row 
than the one Process 23912 updated. 
Table bravo has a foreign key to table alpha.

My understanding of the deadlock report is that the statements shown in the log 
are the actual statements on which the two processes were waiting. 

What I think is unusual in this situation is the INSERT statement that is part 
of the deadlock situation. 

The only way I can think of for a deadlock to happen during an INSERT is if 
process 23912 had inserted a row into bravo with the same PK value that process 
24342 is trying to insert. But process 23912 never even touches that table, so 
I am a bit confused about how this can happen. 

Can the foreign key between bravo and alpha play a role here? With some simple 
test setups I could not get the insert to wait, even when it referenced the 
row that the other process had updated. 
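For what it's worth, here is one interleaving (against an invented minimal schema) that reproduces the *shape* of this report on 9.3 — a plain UPDATE deadlocking against a plain INSERT — via the FOR KEY SHARE lock the FK check takes on the referenced row. Note that this particular interleaving needs a key-column update to create the conflict, so it demonstrates the mechanism rather than necessarily the exact cause here:

```sql
-- Invented minimal schema, for illustration only:
CREATE TABLE alpha (id int PRIMARY KEY, some_flag boolean);
CREATE TABLE bravo (id serial PRIMARY KEY, alpha_id int REFERENCES alpha);
INSERT INTO alpha VALUES (1, false), (2, false);

-- Session A (plays the role of 24342):
BEGIN;
UPDATE alpha SET some_flag = true WHERE id = 1;  -- row-locks alpha id = 1

-- Session B (plays the role of 23912):
BEGIN;
UPDATE alpha SET id = 20 WHERE id = 2;           -- key update: takes a
                                                 -- FOR UPDATE-strength lock

-- Session A:
INSERT INTO bravo (alpha_id) VALUES (2);         -- FK check wants FOR KEY SHARE
                                                 -- on alpha id = 2; blocks on B

-- Session B:
UPDATE alpha SET some_flag = true WHERE id = 1;  -- blocks on A
                                                 -- => ERROR: deadlock detected
```

So an INSERT can legitimately appear in a deadlock report even though the inserting process never writes the same table as its partner: the hidden row lock on the referenced table is enough.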

This happened on PostgreSQL 9.3.10 running on Debian.

The only change I have found regarding "locks" or "deadlocks" after 9.3.10 is 
one in 9.4.1 that says "Avoid possible deadlock while trying to acquire 
tuple locks in EvalPlanQual processing" - but I guess that does not refer to a 
deadlock at the "user level".

Any ideas?
Thomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to Qualifying or quantify risk of loss in asynchronous replication

2016-03-16 Thread Thomas Munro
On Wed, Mar 16, 2016 at 9:59 PM, otheus uibk  wrote:
>> In asynchronous replication,
>> the primary writes to the WAL and flushes the disk.  Then, for any
>> standbys that happen to be connected, a WAL sender process trundles
>> along behind feeding new WAL down the socket as soon as it can, but
>> it can be running arbitrarily far behind or not running at all (the
>> network could be down or saturated, the standby could be temporarily
>> down or up but not reading the stream fast enough, etc etc).
>
>
>
> This is the *process* I want more detail about. The question is the same as
> above:
>> (is it true that) PG async guarantees that the WAL
>> is *sent* to the receivers, but not that they are received, before the
>> client receives acknowledgement?

The primary writes WAL to disk, and then wakes up walsender processes,
and they read the WAL from disk (presumably straight out of the OS
page cache) in the background and send it down the network some time
later.  Async replication doesn't guarantee anything about the WAL
being sent.

Look for WalSndWakeupRequest() in xlog.c, which expands to a call to
WalSndWakeup in walsender.c which sets latches (= a mechanism for
waking processes) on all walsenders, and see the WaitLatchOrSocket
calls in walsender.c which wait for that to happen.
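One way to observe how far behind each walsender is actually running is the pg_stat_replication view on the primary (column names as in the 9.x series; they were renamed to *_lsn in PostgreSQL 10):

```sql
-- Run on the primary: one row per connected standby.
-- sent_location lags the current WAL insert point when the walsender has
-- not yet even sent the WAL; flush_location lags sent_location when the
-- standby has received it but not yet flushed it to its own disk.
SELECT application_name,
       state,
       sent_location,
       write_location,
       flush_location,
       replay_location
FROM pg_stat_replication;
```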

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [GENERAL] How to Qualifying or quantify risk of loss in asynchronous replication

2016-03-16 Thread otheus uibk
Apologies for the double reply... This is to point out the ambiguity
between the example you gave and what the documentation states.

On Wednesday, March 16, 2016, Thomas Munro wrote:

>
> Waiting for the transaction to be durably stored (flushed to disk) on
> two servers before COMMIT returns means that you can avoid this
> situation:
>
> 1.  You commit a transaction, and COMMIT returns as soon as the WAL is
> flushed to disk on the primary.
> 2.  You communicate a fact based on that transaction to a third party
> ("Thank you Dr Bowman, you are booked in seat A4, your reservation
> number is JUPITER123").
> 3.  Your primary computer is destroyed by a meteor, and its WAL sender
> hadn't yet got around to sending that transaction to the standby


Section 25.2.5. "The standby connects to the primary, which streams WAL
records to the standby as they're generated, without waiting for the WAL
file to be filled."

This suggests that the record is on the network stack possibly before a
flush to disk.

Section 25.2.6. "If the primary server crashes then some transactions that
were committed may not have been replicated to the standby server, causing
data loss. The amount of data loss is proportional to the replication delay
at the time of failover."

Whence this replication delay? If the standby server is caught up and
streaming asynchronously, what delays *in receiving* might there be other
than network delays?

Note: I am totally unconcerned with the possibility that both primary and
standby go down at the same time.


-- 
Otheus
otheus.u...@gmail.com
otheus.shell...@uibk.ac.at


Re: [GENERAL] How to Qualifying or quantify risk of loss in asynchronous replication

2016-03-16 Thread otheus uibk
Thomas, thanks for your input... but I'm not quite getting the answer I
need.

> But what precisely is the algorithm and timing involved with streaming
> WALs?
> >
> > Is it:
> >   * client issues COMMIT
> >   * master receives commit
> >   * master processes transaction internally
> >   * master creates WAL record
> >   | master appends WAL to local WAL file, flushes disk
> >   | master sends WAL record to all streaming clients
> >   * master sends OK to client
> >   * master applies WAL
> >
> > So is this correct? Is it correct to say: PG async guarantees that the
> WAL
> > is *sent* to the receivers, but not that they are received, before the
> > client receives acknowledgement?
>
> Async replication doesn't guarantee anything at all about receivers,
> or even that there is one right at this moment.  Did you mean to write
> "synchronous" instead of "asynchronous"?


I'm only concerned with async (for this thread).


> In asynchronous replication,
> the primary writes to the WAL and flushes the disk.  Then, for any
> standbys that happen to be connected, a WAL sender process trundles
> along behind feeding new WAL down the socket as soon as it can, but
> it can be running arbitrarily far behind or not running at all (the
> network could be down or saturated, the standby could be temporarily
> down or up but not reading the stream fast enough, etc etc).



This is the *process* I want more detail about. The question is the same as
above:
> (is it true that) PG async guarantees that the WAL
> is *sent* to the receivers, but not that they are received, before the
> client receives acknowledgement?

But I will refine what I mean by "sent": does PostgreSQL write the WAL to
the socket and flush the socket before acknowledging the transaction to
the client? Does it *always* do this? Or does it make a best effort? Or do
the write to the socket and the return to the client happen asynchronously?

I realize that the data might not be *seen* at the client, I realize
network buffers may take time to reach the network, and I realize various
levels of synchronous replication provide stronger guarantees. But for the
purposes of this topic, I am interested in what PG actually does. I can't
tell that from the documentation (because it is not clearly stated and
because it is self-contradictory).
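As far as I can tell, the short answer is governed entirely by synchronous_commit: with no synchronous standbys configured, COMMIT waits only for the local WAL flush and makes no promise at all about the socket. A sketch of the relevant settings (the session shown is illustrative, not from this thread):

```sql
-- On the primary:
SHOW synchronous_standby_names;  -- empty => every standby is asynchronous

SET synchronous_commit = on;     -- with no sync standbys this means: wait
                                 -- only for the *local* WAL flush

-- COMMIT returns as soon as the commit record is flushed to local disk;
-- walsenders read that WAL and write it to the network afterwards, with
-- no ordering guarantee relative to the client's acknowledgement.
```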

-- 
Otheus
otheus.u...@gmail.com
otheus.shell...@uibk.ac.at