Re: [PATCH v2 0/2] Refactor snapshot vs nocow writers locking

2019-07-30 Thread Valentin Schneider
And with a more accurate spec (appended at the end of this email):

Model checking completed. No error has been found.
  Estimates of the probability that TLC did not check all reachable states
  because two distinct states had the same fingerprint:
  calculated (optimistic):  val = 8.6E-8
  based on the actual fingerprints:  val = 9.8E-9
3100306 states generated, 651251 distinct states found, 0 states left on 
queue.

IOW, it seems fine with regard to the defined properties plus the deadlock
check.
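
FWIW, a .cfg along these lines is enough to drive that kind of check (just
a sketch - the constant values are arbitrary small numbers, and Spec is the
specification the PlusCal translator generates):

  SPECIFICATION Spec
  CONSTANTS
      NR_READERS = 3
      NR_WRITERS = 2
      READER_TASK = READER_TASK
      WRITER_TASK = WRITER_TASK
  INVARIANTS
      TypeCheck
      Exclusion
      LockState
  PROPERTIES
      EventuallyRead
  CHECK_DEADLOCK TRUE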

The EventuallyRead liveness property shows that a waiting reader will
always eventually get the lock (no reader starvation), of course assuming
no lock-user blocks in its critical section (i.e. no stuttering steps).

It doesn't hold for writers - they can be starved by a never-ending stream
of readers. IOW

  \* Eventually one writer gets the lock
  <> \E writer \in WRITERS: pc[writer] = "write_cs"

can be disproven by some behaviours. However,

  \* No writer ever gets the lock
  [] \A writer \in WRITERS: pc[writer] # "write_cs"

can be disproven as well - there are *some* paths that allow writers to
get the lock. I haven't found a neater way to check that other than by
having a "debug" property that I don't include in the full-fledged check.
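
Something like this, kept out of the main .cfg (the name is made up); TLC
reports it as violated, and the counterexample trace is a behaviour where a
writer does reach its critical section:

  (* Deliberately violated "debug" property *)
  WritersNeverLock ==
      [] \A writer \in WRITERS: pc[writer] # "write_cs"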



Note that the entire content of a label is considered atomic by TLC. From
the point of view of that spec:

Read lock:
- reader count increment is atomic
- writer count read is atomic

Read unlock:
- reader count decrement is atomic
(The model's reader counter read is in the same atomic block as the
reader count decrement, but it's only used for updating a model-internal
variable, so I'm not including it here)

Write lock:
- writer count increment is atomic
- reader count read is atomic
- writer count decrement is atomic

Write unlock:
- writer count decrement is atomic
(ditto on the writer counter read)

This doesn't help for the placement of memory barriers, but from an
algorithm point of view that seems to be holding up.
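
If any of those steps ever needed to be modelled at a finer grain, it's just
a matter of adding a label. E.g. a sketch of ReadLock with the lock_state
update split out of the writer count check (the new label name is made up):

  procedure ReadLock()
  {
  add:
      reader_count := reader_count + 1;
  lock:
      await writer_count = 0;
  set_state:
      lock_state := "read_locked";
      return;
  }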



Here's the actual spec (if I keep this up I'll get a git tree going...)
---
specs.tla:

---------------------------- MODULE specs ----------------------------
EXTENDS Integers, Sequences, FiniteSets, TLC

CONSTANTS
    NR_WRITERS,
    NR_READERS,
    WRITER_TASK,
    READER_TASK

WRITERS == {WRITER_TASK} \X (1..NR_WRITERS)
READERS == {READER_TASK} \X (1..NR_READERS)
THREADS == WRITERS \union READERS

(*--algorithm DRW {

variables
    lock_state = "idle",
    reader_count = 0,
    writer_count = 0;

define {

    LOCK_STATES == {"idle", "write_locked", "read_locked"}

    (* Safety invariants *)
    TypeCheck ==
        /\ lock_state \in LOCK_STATES
        /\ reader_count \in (0..NR_READERS)
        /\ writer_count \in (0..NR_WRITERS)

    (* Ensure critical section exclusiveness *)
    Exclusion ==
        /\ (\E writer \in WRITERS: pc[writer] = "write_cs") =>
               \A reader \in READERS: pc[reader] # "read_cs"
        /\ (\E reader \in READERS: pc[reader] = "read_cs") =>
               \A writer \in WRITERS: pc[writer] # "write_cs"

    (* Ensure the lock state is sane *)
    LockState ==
        /\ lock_state = "idle" =>
               (* Not much can be said about the counts - the whole range is valid *)
               /\ \A writer \in WRITERS: pc[writer] # "write_cs"
               /\ \A reader \in READERS: pc[reader] # "read_cs"
        /\ lock_state = "read_locked" =>
               (* Some readers can still be in the locking procedure, just make sure *)
               (* all readers in their critical section are accounted for            *)
               reader_count >= Cardinality({r \in READERS: pc[r] = "read_cs"})
        /\ lock_state = "write_locked" =>
               (* Ditto for writers *)
               writer_count >= Cardinality({w \in WRITERS: pc[w] = "write_cs"})

    (* Liveness properties *)

    (* A waiting reader *always* eventually gets the lock *)
    EventuallyRead ==
        reader_count > 0 /\ lock_state # "read_locked" ~>
            lock_state = "read_locked"

    (* A waiting writer *can* eventually get the lock *)
    EventuallyWrite ==
        (* TODO: find a way to express that this can be false in some behaviours *)
        (*   but has to be true in > 0 behaviours *)
        TRUE
}

macro ReadUnlock()
{
    reader_count := reader_count - 1;
    \* Condition variable signal is "implicit" here
    if (reader_count = 0) {
        lock_state := "idle";
    };
}

macro WriteUnlock()
{
    writer_count := writer_count - 1;
    \* Ditto on the cond var
    if (writer_count = 0) {
        lock_state := "idle";
    };
}

procedure ReadLock()
{
add:
    reader_count := reader_count + 1;
lock:
    await writer_count = 0;
    lock_state := "read_locked";
    return;
}

procedure WriteLock()
variables rc;
{
loop:
    while (TRUE) {
        writer_count := writer_count + 1;
get_readers:
        rc := reader_count;
check_readers:
        if (rc > 0) {
            writer_count := writer_count - 1;
wait:
            await reader_count = 0;
        } else {
            goto locked;
        };
    };

locked:
    lock_state := "write_locked";
    return;
}

Re: [PATCH v2 0/2] Refactor snapshot vs nocow writers locking

2019-07-29 Thread Catalin Marinas
Some nitpicking below:

On Mon, Jul 29, 2019 at 03:13:42PM +0100, Valentin Schneider wrote:
> specs.tla:
> 
> ---------------------------- MODULE specs ----------------------------
> EXTENDS Integers, Sequences, TLC
> 
> CONSTANTS
> NR_WRITERS,
> NR_READERS,
> WRITER_TASK,
> READER_TASK
> 
> WRITERS == {WRITER_TASK} \X (1..NR_WRITERS)
> READERS == {READER_TASK} \X (1..NR_READERS)
> THREADS == WRITERS \union READERS

Recommendation: use symbolic values for WRITERS and READERS (defined in
the .cfg, e.g. r1, r2, r3, w1, w2, w3). It allows you to do symmetry
optimisations. We've also hit a TLC bug in the past with process values
made up of a Cartesian product (though it may have been fixed since).
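
Roughly something like this (definition and model-value names are only
illustrative), with READERS/WRITERS turned into constants:

  \* specs.tla (Permutations comes from the TLC module)
  CONSTANTS READERS, WRITERS
  Perms == Permutations(READERS) \union Permutations(WRITERS)

  \* specs.cfg
  CONSTANTS
      READERS = {r1, r2, r3}
      WRITERS = {w1, w2}
  SYMMETRY Perms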

> macro ReadLock(tid)
> {
>     if (lock_state = "idle" \/ lock_state = "read_locked") {
>         lock_state := "read_locked";
>         threads[tid] := "read_locked";
>     } else {
>         assert lock_state = "write_locked";
>         \* waiting for writers to finish
>         threads[tid] := "write_waiting";
>         await lock_state = "" \/ lock_state = "read_locked";

lock_state = "idle"?

> macro WriteLock(tid)
> {
>     if (lock_state = "idle" \/ lock_state = "write_locked") {
>         lock_state := "write_locked";
>         threads[tid] := "write_locked";
>     } else {
>         assert lock_state = "read_locked";
>         \* waiting for readers to finish
>         threads[tid] := "read_waiting";
>         await lock_state = "idle" \/ lock_state = "write_locked";
>     };
> }

I'd say that's one of the pitfalls of PlusCal. The above is executed
atomically, so you'd have the lock_state read and updated in the same
action. Looking at the C patches, there is an
atomic_read(&lock->readers) followed by a
percpu_counter_inc(&lock->writers). Between these two, you can have
"readers" becoming non-zero via a different CPU.

My suggestion would be to use procedures with labels to express the
non-atomicity of such sequences.
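
E.g. a skeleton along these lines (variable and label names just for
illustration), modelling the counters, where each label is a separate step
so "readers" can change between the increment and the read:

  procedure WriteLock()
  variables rc;
  {
  inc_writers:
      writer_count := writer_count + 1;
  read_readers:
      rc := reader_count;
  check_readers:
      if (rc > 0) {
          writer_count := writer_count - 1;
  wait_readers:
          await reader_count = 0;
          goto inc_writers;
      };
  locked:
      lock_state := "write_locked";
      return;
  }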

> macro ReadUnlock(tid) {
>     if (threads[tid] = "read_locked") {
>         threads[tid] := "idle";
>         if (\A thread \in THREADS: threads[thread] # "read_locked") {
>             \* we were the last read holder, everyone else should be
>             \* waiting, unlock the lock
>             lock_state := "idle";
>         };
>     };
> }

I'd make this closer to the proposed C code with atomic counters. You
wouldn't be able to check each thread atomically in practice anyway.

-- 
Catalin