I'm with you on patents. I came across an IBM patent yesterday, dated 2009,
describing a lock-free storage manager that uses cell pools to manage
variable-length storage. I invented my first one 30 years ago using CAS. It
was writing a more sophisticated version 10 years ago that led to my
research on PLO. I'm now on my fourth version of this storage manager, this
one written for my own use, and it's much more sophisticated than the
patented algorithm, with many more capabilities.

Like music, every piece of software is based on what we've seen before. So
my real question with software patents is: how do you prove it's original?
It's the chicken and the egg. Which came first?

Kenneth

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of David Crayford
Sent: Thursday, November 14, 2013 6:40 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

On 14/11/2013 12:23 AM, Kenneth Wilkerson wrote:
> If I read the article you sent correctly, this algorithm is using a 
> spin lock. It has provision for implementing a lock-free algorithm but 
> none of those are detailed.

Most of the shared_ptr implementations in the wild, including the one on
z/OS, use a lock-free algorithm invented by Alexander Terekhov (IBM). An
atomic reference-counted smart pointer is a nice tool to have in your bag.
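
The basic shape of an atomically reference-counted pointer is easy to
sketch in standard C++ (a minimal illustration only - not Terekhov's
algorithm or the z/OS library code):

#include <atomic>
#include <cstdio>

// Minimal atomically reference-counted pointer, for illustration only.
template <typename T>
class arc_ptr {
    struct block {
        explicit block(const T& v) : value(v) {}
        std::atomic<long> refs{1};
        T value;
    };
    block* b_;

public:
    explicit arc_ptr(const T& v) : b_(new block(v)) {}

    arc_ptr(const arc_ptr& other) : b_(other.b_) {
        // another owner appears: a relaxed increment is sufficient
        b_->refs.fetch_add(1, std::memory_order_relaxed);
    }

    ~arc_ptr() {
        // the last owner must observe all prior writes before deleting
        if (b_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete b_;
    }

    arc_ptr& operator=(const arc_ptr&) = delete;   // kept minimal

    T& operator*() const { return b_->value; }
};

int main() {
    arc_ptr<int> p(42);
    arc_ptr<int> q(p);          // reference count bumped atomically
    std::printf("%d\n", *q);    // prints 42
    return 0;
}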

> PLO has no restrictions other than the limits of my imagination.

Or patents! I notice IBM has quite a few with respect to PLO. This one, for
example: http://www.google.com.mx/patents/US6636883. The US patent office
seems to like patents related to multi-processing. Maybe they think it's
novel.

Software patents have a funky smell!

>
> However, in the last sentence of the performance section, the authors 
> state "Note that we make no claim that the atomic access approach 
> unconditionally outperforms the others, merely that it may be best for 
> certain scenarios that can be encountered in practice." I think this 
> is the key to lock-free implementations. I have generalized primitive 
> functions to perform most of my PLO operations. But I design and tune 
> each to the specific use. I understand the desire to provide 
> generalized functionality for high level language use. However, I do 
> not accept the premise that all lists are the same. And I would 
> certainly use different algorithms for lists that I expected to get 
> "large" and lists that have more than a small percentage of update
operations.

While it may not be true on z/OS, most software developers these days use
high-level languages for multi-threaded applications and prefer abstractions
because they are easy to use and reason about. Of course, that doesn't mean
they shouldn't understand what's happening under the covers. The trick is
keeping the optimizer out of the mix, which is where inline assembler comes
in handy.
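
For example, with GCC- or Clang-style compilers an empty asm statement with
a "memory" clobber acts as a pure compiler barrier - it emits no machine
instructions but stops the optimizer from caching or reordering memory
accesses across it. C++11 also has a portable spelling of the same idea:

#include <atomic>

// GCC/Clang-style compiler barrier: no machine code is emitted, but the
// "memory" clobber forbids the optimizer from caching or reordering
// memory accesses across this point.
static inline void compiler_barrier() {
    asm volatile("" ::: "memory");
}

// Portable C++11 spelling of the same compiler-only fence.
static inline void compiler_fence() {
    std::atomic_signal_fence(std::memory_order_seq_cst);
}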

There are many high-quality (free) libraries for lock-free data structures
that are relatively easy to port to z/OS. Using advanced language features,
it's quite simple to configure a highly granular concurrent queue using
policies. The difficult part is testing the bloody things!

typedef mpmc_queue< fixed_array, lock_free, smart_ptr >  multi_queue;
typedef spsc_queue< priority_list, mutex, unique_ptr >   blocking_queue;
typedef thread_pool< multi_queue, max_threads<8> >       task_pool;

socket_server<task_pool> server;


> Kenneth
>
>
> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] 
> On Behalf Of David Crayford
> Sent: Tuesday, November 12, 2013 11:45 PM
> To: IBM-MAIN@LISTSERV.UA.EDU
> Subject: Re: Serialization without Enque
>
> On 13/11/2013 12:34 PM, Kenneth Wilkerson wrote:
>> Actually, the algorithm performs well for read-often, write-rarely 
>> lists because the active chain count does not change and therefore 
>> there are relatively infrequent re-drives. The active chain count 
>> only changes on an add or delete. So if there are infrequent adds and 
>> deletes, there will be infrequent re-drives. And you are wrong: 
>> readers will not contend unless two or more tasks are referencing the 
>> exact same element simultaneously. And even then, the re-drive only 
>> involves the update to the use count.
>
> Thanks, I get it now. Maybe IBM should have used PLO for the z/OS C++ 
> shared_ptr SMR algorithm, which has both weak/strong reference counts + 
> the pointer. They use a lock-free algorithm based on CS: 
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2674.htm.
> shared_ptr implements proxy-based garbage collection.
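>
> One CS-style piece of any lock-free shared_ptr is the
> increment-only-if-not-zero loop on the strong count when a weak reference
> is promoted; a minimal sketch of that idea in standard C++ (std::atomic
> standing in for CS - illustrative only, not the z/OS implementation):
>
> #include <atomic>
>
> // Promote a weak reference: bump the strong count only if the object
> // has not already expired (strong == 0).
> bool try_add_strong_ref(std::atomic<long>& strong) {
>     long n = strong.load(std::memory_order_relaxed);
>     while (n != 0) {
>         // on failure, n is reloaded and the loop re-drives the compare
>         if (strong.compare_exchange_weak(n, n + 1,
>                                          std::memory_order_acq_rel,
>                                          std::memory_order_relaxed))
>             return true;    // promotion succeeded
>     }
>     return false;           // object already expired
> }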
>
>> There are a lot of optimizations that I do not describe in this 
>> algorithm, for simplicity. For example, when you do a PLO DCAS, the 
>> condition code is set to indicate which compare failed. This can be 
>> used to optimize the re-drive.
>> There are many other optimizations with re-drives to make this more 
>> efficient. And if you get away from traditional lists, there are even 
>> more optimizations.
> Like what? What do you replace linked lists with, arrays with offsets 
> instead of pointers?
>
>> Honestly, I provided this algorithm after reading the paper on hazard 
>> pointers. The paper was written in 2002 and claimed there was no 
>> atomic DCAS, even though PLO DCAS became available in 2001. So I took 
>> a much simpler algorithm that I had and modified it to use a use count 
>> to accommodate traditional storage managers, to prove that PLO could 
>> be used to manage a conventional list using a traditional storage 
>> manager and provide an SMR algorithm without the need for DU-level 
>> management structures. I don't use many conventional lists, and I have 
>> a proprietary storage manager that does not use chains.
>> Most of my PLO operations are much simpler.
>> I would love to test this algorithm against any other SMR algorithm.
>> My point has been, and remains, that PLO can be efficiently used to 
>> serialize any list in a lock-free manner, and even if it does take 
>> more CPU, this will be offset by increased throughput.
>>
>> And just because UNIX has issues with PLO doesn't mean the issue is 
>> with PLO...
> UNIX doesn't have an issue with PLO. The documentation clearly states that 
> popping/pushing elements at the beginning of the queue is a good 
> performer. Surely your algorithm would have the same problem if 
> multiple producers/consumers were inserting/removing elements from the 
> middle of a long list.
>
>> Kenneth
>>
>> -----Original Message-----
>> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] 
>> On Behalf Of David Crayford
>> Sent: Tuesday, November 12, 2013 8:39 PM
>> To: IBM-MAIN@LISTSERV.UA.EDU
>> Subject: Re: Serialization without Enque
>>
>> Thanks for sharing your design, Ken. It seems to me that PLO is best 
>> used for data structures like double-ended queues where elements can 
>> be inserted/removed from both ends of the queue atomically. In the 
>> case of a read-often-write-rarely list with multiple readers that 
>> traverse the list it doesn't seem optimal. Correct me if I'm wrong 
>> but readers will contend with each other.
>>
>> FYI, z/OS UNIX message queues can be configured to use PLO (with 
>> fallback to a latch). The documentation states that 
>> inserting/removing from the middle of a long list using PLO is a poor 
>> performer:
>> http://pic.dhe.ibm.com/infocenter/zos/v1r12/topic/com.ibm.zos.r12.bpxb100/qct.htm#qct
>>
>>
>> There isn't a one-size-fits-all solution. It depends on the usage 
>> patterns. Hopefully HTM will solve that. Depending on how Intel's 
>> Haswell TSX implementation performs in the wild, we could see HTM in 
>> our cell phones as early as next year.
>>
>>
>> On 12/11/2013 11:18 PM, Kenneth Wilkerson wrote:
>>> I use cell pools. I also use a proprietary storage manager that 
>>> doesn't use chains. These methodologies offer capabilities well 
>>> beyond those found in traditional methods. Much of what I do is 
>>> based on these capabilities, but the algorithms could easily be 
>>> adapted to use a conventional storage manager that uses chains.
>>>
>>> Here is an algorithm that I've adapted for traditional storage 
>>> management. It will work for any list (LIFO, FIFO, ordered, or 
>>> other) and for deletion from the head, middle, or tail.
>>>
>>> Setup: Allocate a word in each element. Bit 0 is one for all active 
>>> elements. Bit 1 is one for all elements pending deletion. Bit 2 is 
>>> reserved for an extension to this algorithm (such as garbage 
>>> cleanup). The remaining bits are a use count allowing for many more 
>>> DUs than are supported by MVS.
>>>
>>> Note: When PLO Compare and Swap (CS) or Double Compare and Swap 
>>> (DCAS) is referenced, the PLO uses the use count address as the lock 
>>> word. This will serialize all updates to the use counter for that 
>>> element. For the PLO Compare and Loads or PLO update, the lock word 
>>> is the active chain counter.
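>>>
>>> As a sketch, that word layout could be written out like this (bit 0 
>>> being the most significant bit, per z/Architecture numbering; the 
>>> names are illustrative only, not the production code):
>>>
>>> #include <cstdint>
>>>
>>> constexpr uint32_t ELEM_ACTIVE   = 0x80000000u; // bit 0: on the active chain
>>> constexpr uint32_t ELEM_PEND_DEL = 0x40000000u; // bit 1: delete is pending
>>> constexpr uint32_t ELEM_RESERVED = 0x20000000u; // bit 2: reserved (e.g. cleanup)
>>> constexpr uint32_t ELEM_USE_MASK = 0x1FFFFFFFu; // bits 3-31: use count
>>>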
>>> To search:
>>> Step 1:  Use the PLO Compare and Load on the active chain counter to 
>>> search the chain as before.  If the PLO fails, re-drive the search.
>>>
>>> Step 2:  Before examining the element, increment the use count with 
>>> a PLO Double Compare and Swap (DCAS). Load the first register pair 
>>> with the current chain counter. The swap value will also be the 
>>> current chain counter. Essentially, we're using the active chain 
>>> count to serialize increments  to the use count to avoid accessing 
>>> an area that may have been released. The second register pair will 
>>> contain the current use count with a swap value incremented by 1 
>>> using an AL to avoid resetting the high bit. If the PLO DCAS fails, 
>>> the previous PLO Compare and Load (Step 1) should be re-driven.
>>>
>>> Step 3: Use the PLO Compare and Load for the next element. Save the 
>>> PLO status and decrement the use count with an SL using a PLO CS. We 
>>> don't need a DCAS because the use count is not 0 and this element 
>>> can't be deleted. If this PLO CS fails, re-drive it. If the PLO 
>>> Compare and Load status is re-drive, then before re-driving the 
>>> search, check the use count (in the register used to update it): if 
>>> bit 1 (pending delete) is set and the use count is 0, this task can 
>>> release it.
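>>>
>>> A rough C++ sketch of the Step 2 increment, assuming a hypothetical 
>>> plo_dcas() wrapper around the PLO Double Compare and Swap function 
>>> (the real code is assembler; this only shows the operand roles):
>>>
>>> #include <cstdint>
>>>
>>> // Hypothetical wrapper around PLO Double Compare and Swap, declared
>>> // only for this sketch; returns true when both compares matched.
>>> bool plo_dcas(volatile uint32_t* lock_word,
>>>               volatile uint32_t* addr1, uint32_t cmp1, uint32_t swap1,
>>>               volatile uint32_t* addr2, uint32_t cmp2, uint32_t swap2);
>>>
>>> // Step 2: bump the element's use count, guarded by the active chain
>>> // counter so we never touch an element that has been released.
>>> bool bump_use_count(volatile uint32_t* chain_counter,
>>>                     volatile uint32_t* use_count) {
>>>     uint32_t cc = *chain_counter;   // compare 1: chain counter,
>>>                                     //   swapped back unchanged
>>>     uint32_t uc = *use_count;       // compare 2: current use count
>>>     return plo_dcas(use_count,      // lock word = use count address
>>>                     chain_counter, cc, cc,
>>>                     use_count, uc, uc + 1);  // unsigned add, like AL
>>>     // if this fails, re-drive the Step 1 Compare and Load
>>> }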
>>>
>>> To delete: (this assumes the deleting task has already updated the 
>>> use count in the process of finding the element to delete)
>>>
>>> Step 1: Use a PLO update function to remove the element from the 
>>> active chain to avoid any future references.
>>>
>>> Step 2: If the PLO update to remove the element fails, decrement 
>>> the use count but do NOT set bit 1 (pending delete), using a PLO CS.
>>> If the PLO CS fails, re-drive it.
>>>
>>> Step 3: If the PLO update to remove the element succeeds, decrement 
>>> the use count, SET bit 1 (pending delete), and RESET bit 0 (active 
>>> bit), using a PLO CS. If the PLO CS fails, re-drive it.
>>>     
>>> Step 4: Whether the PLO update succeeded or failed, check the use 
>>> count in the register used to update it:
>>>   - If bit 1 (pending delete) is set and the use count is not 0, 
>>>     then this task should exit.
>>>   - If bit 1 (pending delete) is set and the use count is 0, then 
>>>     this task can release it.
>>>   - Otherwise, this task must re-drive the search to find the 
>>>     element to be deleted.
>>>
>>> You can work out the various scenarios yourself. But because the 
>>> count is incremented/decremented after a PLO Compare and Load or 
>>> update, the status of the PLO provides a decision point on whether 
>>> an element may have been deleted. Using the use count address as the 
>>> lock word ensures that all use count updates for a specific use 
>>> count occur serially. There are numerous extensions to this 
>>> algorithm that are more than I want to describe here. Things like 
>>> adding pending deletes to a delete chain, or having an asynchronous, 
>>> periodic garbage collection task handle the deletes.
>>> Kenneth
>>> -----Original Message-----
>>> From: IBM Mainframe Discussion List 
>>> [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jon Perryman
>>> Sent: Monday, November 11, 2013 9:38 PM
>>> To: IBM-MAIN@LISTSERV.UA.EDU
>>> Subject: Re: Serialization without Enque
>>>
>>> Could you provide insight into a design that would handle the 
>>> situation where an SSI program with a non-trivial workload uses a 
>>> very large list? This list is normally semi-static, but there can be 
>>> periods of time where entries are heavily deleted and added. Not 
>>> freeing storage is not an option, because it could be a significant 
>>> amount of storage.
>>> Thanks, Jon Perryman.
>>>
>>>
>>>
>>>> ________________________________
>>>> From: Tony Harminc <t...@harminc.net>
>>>> To: IBM-MAIN@LISTSERV.UA.EDU
>>>> Sent: Monday, November 11, 2013 7:07 PM
>>>> Subject: Re: Serialization without Enque
>>>>
>>>>
>>>> On 11 November 2013 20:15, Jon Perryman <jperr...@pacbell.net> wrote:
>>>>>          L    R2,QUEUE
>>>>>          L    R3,NEXT_ENTRY
>>>>>          CS   R2,R3,QUEUE              New queue head
>>>>>
>>>>> While this seems bulletproof, it's not. If there is a long delay 
>>>>> between the L instructions, the next entry could have been freed, 
>>>>> causing an abend.
>>>> If your code is referencing an item that may have been freed, then 
>>>> your design is wrong.
>>>>
>>>>> While the delay is very unlikely, it is still a possibility (e.g.
>>>>> swapout after the first "L"). This example is just removing the 
>>>>> first entry. If you try to remove an entry in the middle of the 
>>>>> chain, you run a larger risk.
>>>>> Some of the techniques we use to minimize risk:
>>>>> 1. Abend recovery.
>>>>> 2. Don't overload the entry - it only contains shared information.
>>>>> 3. Place the entry on a timeout queue before placing it on a free 
>>>>>    queue or freeing it.
>>>>> 4. Copy the entry to a work area - reduces reference time of the 
>>>>>    actual entry.
>>>>> 5. Eyecatchers that can be verified.
>>>>> 6. A unique identifier for each entry that can be verified.
>>>>> 7. Turn off interrupts.
>>>> Most of these are indicative of incorrect design rather than of 
>>>> careful attention to risk. If it's unclear whether a queue entry is 
>>>> eligible to be worked on or even to be looked at unless you do it 
>>>> in a hurry, then there is something quite wrong.
>>>>
>>>> By all means provide abend recovery, but don't make it part of the 
>>>> queue handling design.
>>>>
>>>> Tony H.
>>>>