[lock-free] Re: 3-thread consensus.

2020-01-12 Thread Dmitry Vyukov
On Sun, Jan 12, 2020 at 7:33 PM bittnkr  wrote:
>
>
> > This blocks consumers from progressing (consuming the next produced items)
> > and is effectively a mutex.
>
> Suppose thread A gets a local copy of tail in t and is then preempted,
>
> another thread will get the same tail and take the seat normally.
>
> When the previous thread resumes, the CAS will fail, because the seat was
> taken.
>
> The while loop simply restarts, without any kind of blocking.
>
> Where do you see a mutex here?

I mean preemption between the succeeding CAS and the write of the element/NULL to the array.

If a thread is terminated at that point, the whole queue is broken (no
termination-safety).
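
To make the window concrete, here is a minimal sketch of the two loops in
C++ with std::atomic (a reconstruction from the pseudocode quoted below,
not the code in the repository; a slot value of 0 is assumed to mean
"free seat", so items must be non-zero and N a power of two):

  #include <atomic>
  #include <cstddef>

  template <std::size_t N>
  struct QueueSketch {
    std::atomic<int> data[N] = {};               // 0 == free seat
    std::atomic<std::size_t> head{0}, tail{0};
    static constexpr std::size_t mask = N - 1;

    void push(int item) {                        // item assumed non-zero
      std::size_t t;
      do {
        t = tail.load();
      } while (data[t & mask].load() != 0 ||             // seat still occupied
               !tail.compare_exchange_strong(t, t + 1)); // or lost the race
      // Preemption or death right here leaves the seat claimed but still 0,
      // so every consumer that reaches this slot spins with no way forward.
      data[t & mask].store(item);
    }

    int pop() {
      std::size_t h;
      do {
        h = head.load();
      } while (data[h & mask].load() == 0 ||             // seat still empty
               !head.compare_exchange_strong(h, h + 1)); // or lost the race
      int item = data[h & mask].load();
      // The symmetric window: dying here leaves the seat occupied forever,
      // blocking the producer that eventually wraps around to it.
      data[h & mask].store(0);                           // release the seat
      return item;
    }
  };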


> On Sun, Jan 12, 2020 at 04:49, Dmitry Vyukov  wrote:
>>
>> On Sat, Jan 11, 2020 at 9:26 AM bittnkr  wrote:
>> >
>> > Good observations. Thank you.
>> >
>> > If the thread is preempted at those points, the seat position will be
>> > held in the local variables h and t.
>>
>> This blocks consumers from progressing (consuming the next produced items)
>> and is effectively a mutex. This makes the algorithm non-termination-safe
>> and non-lock-free.
>>
>>
>> > After the thread is resumed, the CAS can fail, and the loop will just
>> > restart in the normal flow, without any locking.
>> >
>> > I updated the docs; I think it is clearer now.
>> >
>> > On Sat, Jan 11, 2020 at 05:38, Dmitry Vyukov  wrote:
>> >>
>> >> On Sat, Jan 11, 2020 at 4:09 AM bittnkr  wrote:
>> >> >
>> >> > Hello Dmitry and fellow group members.
>> >> >
>> >> > If you look carefully, you will see that there is no blocking, just a
>> >> > simple CAS, even though the buffer is a shared buffer.
>> >>
>> >> But consider what you see if you look inside a spin lock, or if you
>> >> take a spin-lock-based algorithm and inline all the spin lock code
>> >> into the algorithm. What you will see is exactly the same -- no
>> >> blocking, just a CAS. However, that does not really change anything;
>> >> it's still a blocking, mutex-based algorithm.
>> >> One can also combine the spin lock state with some data, e.g. a
>> >> particular value of a data member means "locked" and blocks progress
>> >> of other threads. That makes the spin lock even less visible, but
>> >> again changes nothing at the conceptual level.
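
As a concrete illustration of that point (a sketch assuming C++ and
std::atomic, not code from the uniq repository): read instruction by
instruction, a spin lock is also "no blocking, just an atomic
read-modify-write", yet it is still a lock, because the thread that wins
the exchange stalls everyone else until it releases.

  #include <atomic>

  struct SpinLock {
    std::atomic<bool> locked{false};
    // The winner of exchange(true) holds the lock; everyone else loops.
    void lock() { while (locked.exchange(true)) { /* spin */ } }
    void unlock() { locked.store(false); }
  };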
>> >>
>> >> If I am reading the algorithm correctly, if a thread is preempted
>> >> between these 2 operations:
>> >>
>> >>  } while ( (data[t & mask]) || (CompareExchange(tail, t+1, t) != t) )
>> >>   data[t & mask] = item // now is safe to update the buffer
>> >>
>> >> It effectively locks a mutex on the queue and blocks progress of all 
>> >> consumers.
>> >>
>> >> There is also a similar block on the consumer side here:
>> >>
>> >>  } while ( !(data[h & mask]) || CompareExchange(head, h+1, h) != h )
>> >>   data[h & mask] = 0 // release the seat
>> >>
>> >>
>> >> > The queue only sleeps if it is empty or if it is full. But this is not
>> >> > locking.
>> >> >
>> >> > All protection is done by the two atomic variables head and tail.
>> >> >
>> >> > The cost is roughly constant, close to a single CAS, for any number of threads.
>> >> >
>> >> > Take a look at these benchmarks. They speak for themselves.
>> >> >
>> >> > https://github.com/bittnkr/uniq/blob/master/README.md#benchmarks
>> >> >
>> >> > Although I don't have a formal proof, we have a test with a zero checksum.
>> >> >
>> >> > These are the results I have for a producer/consumer test pushing random
>> >> > numbers into the queue.
>> >> >
>> >> > Creating 4 producers & 4 consumers
>> >> > to flow 10.000.000 items through the queue.
>> >> >
>> >> > Produced: 10.743.668.245.000.000
>> >> > Consumed: 5.554.289.678.184.004
>> >> > Produced: 10.743.668.245.000.000
>> >> > Consumed: 15.217.833.969.059.643
>> >> > Produced: 10.743.668.245.000.000
>> >> > Consumed: 7.380.542.769.600.801
>> >> > Produced: 10.743.668.245.000.000
>> >> > Consumed: 14.822.006.563.155.552
>> >> >
>> >> > Checksum: 0 (it must be zero)
>> >> >
>> >> > The producers increase the total and the consumers decrease it. The
>> >> > result for 10M random numbers is zero.
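
A minimal sketch of such a zero-checksum harness (assuming C++ threads and
a stand-in mutex-protected queue so the sketch compiles on its own; the
queue under test would replace StandInQueue):

  #include <atomic>
  #include <cstdio>
  #include <mutex>
  #include <queue>
  #include <random>
  #include <thread>
  #include <vector>

  struct StandInQueue {                    // placeholder for the queue under test
    std::mutex m;
    std::queue<long long> q;
    void push(long long v) { std::lock_guard<std::mutex> g(m); q.push(v); }
    bool pop(long long& v) {
      std::lock_guard<std::mutex> g(m);
      if (q.empty()) return false;
      v = q.front(); q.pop(); return true;
    }
  };

  int main() {
    constexpr int kThreads = 4;            // 4 producers & 4 consumers
    constexpr long long kPerThread = 2500000;  // 10.000.000 items in total
    StandInQueue q;
    std::atomic<long long> checksum{0};
    std::vector<std::thread> threads;

    for (int p = 0; p < kThreads; ++p)     // producers add each pushed value
      threads.emplace_back([&, p] {
        std::mt19937_64 rng(p + 1);
        for (long long i = 0; i < kPerThread; ++i) {
          long long v = (long long)(rng() % 1000) + 1;  // non-zero random values
          checksum.fetch_add(v);
          q.push(v);
        }
      });
    for (int c = 0; c < kThreads; ++c)     // consumers subtract each popped value
      threads.emplace_back([&] {
        long long v;
        for (long long i = 0; i < kPerThread; ++i) {
          while (!q.pop(v)) { /* wait for an item */ }
          checksum.fetch_sub(v);
        }
      });
    for (auto& t : threads) t.join();
    // If no item is lost or duplicated, additions and subtractions cancel out.
    std::printf("Checksum: %lld (it must be zero)\n", checksum.load());
  }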
>> >> >
>> >> > Thanks for the advice, I'll look into those tools.
>> >> >
>> >> > On Thu, Jan 9, 2020 at 04:58, Dmitry Vyukov  wrote:
>> >> >>
>> >> >> On Wed, Jan 8, 2020 at 9:29 PM bittnkr  wrote:
>> >> >> >
>> >> >> > Dear Dmitry,
>> >> >> >
>> >> >> > I found a nice solution to the problem called 3-thread consensus,
>> >> >> > considered impossible in the book The Art of Multiprocessor
>> >> >> > Programming. I think it is a breakthrough.
>> >> >> >
>> >> >> > While debating on S.O. with someone about whether the solution is solid
>> >> >> > or whether data races can occur, he referred me to Relacy.
>> >> >> >
>> >> >> > So I'm writing you to ask your opinion about the solution.
>> >> >> >
>> >> >> > Can you take a quick look at it?
>> >> >>
>> >> >> +lock-free mailing list
>> >> >>
>> >> >> Hi bittnkr,
>> >> >>
>> >> >> At first glance the algorithm at https://github.com/bittnkr/uniq looks
>> >> >> blocking and non-linearizable to me.
>> >

[lock-free] Re: 3-thread consensus.

2020-01-12 Thread Dmitry Vyukov
+lock-free group again, please don't drop it

On Mon, Jan 13, 2020 at 8:19 AM bittnkr  wrote:
>
> If a thread dies at that point, my entire system is broken...

Which means it's not lock-free.

> But it can be preempted without any problem, at near-zero cost.

Well, if a producer is preempted, consumers are blocked from
progressing. This is 100% equivalent to a mutex: a thread preempted
while holding a mutex also does not cause any correctness problems.
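
The same point in sketch form (assuming C++; an illustration, not anyone's
production code): a mutex-based push has exactly the property being
claimed -- preemption inside the critical section corrupts nothing, it
merely stops every other thread until the holder runs again.

  #include <mutex>
  #include <queue>

  std::mutex m;
  std::queue<int> q;

  void push(int item) {
    std::lock_guard<std::mutex> guard(m);
    // Preemption here is harmless for correctness, but it blocks every
    // other producer and consumer until this thread is scheduled again --
    // the same progress property as the window between the CAS and the
    // store in the lock-free-looking version.
    q.push(item);
  }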

