Re: Serialization without Enque

2013-11-19 Thread Shane Ginnane
On Mon, 18 Nov 2013 22:34:57 -0800, Ed Jaffe wrote:

On 11/10/2013 1:19 PM, Mark Zelden wrote:
 I've not been paying that close of attention, but I'm more curious
 about what people did for these situations prior to PLO.

ENQ/DEQ or Latch

I can still vividly remember first reading PLO in the Pops- and thinking
bloody hell - umpteen pages 2-up to describe (nominally) just one instruction 
...

Shane ...

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-19 Thread Rob Scott
It's not the number of instructions but what you do with them that counts.

Or something like that...


Rob Scott
Lead Developer
Rocket Software
77 Fourth Avenue . Suite 100 . Waltham . MA 02451-1468 . USA
Tel: +1.781.684.2305
Email: rsc...@rs.com
Web: www.rocketsoftware.com

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Shane Ginnane
Sent: 19 November 2013 09:18
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

On Mon, 18 Nov 2013 22:34:57 -0800, Ed Jaffe wrote:

On 11/10/2013 1:19 PM, Mark Zelden wrote:
 I've not been paying that close of attention, but I'm more curious 
 about what people did for these situations prior to PLO.

ENQ/DEQ or Latch

I can still vividly remember first reading PLO in the Pops- and thinking 
bloody hell - umpteen pages 2-up to describe (nominally) just one instruction 
...

Shane ...

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to 
lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-19 Thread John Gilmore
Long-winded and ugly but functionally adequate serialization machinery
can be developed using TS alone

PLO is convenient and should be more perspicuous; but this thread and
several related predecessors strongly suggest that more often it is
caviar to the general.

John Gilmore, Ashland, MA 01721 - USA

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-19 Thread Anne Lynn Wheeler
jwgli...@gmail.com (John Gilmore) writes:
 Long-winded and ugly but functionally adequate serialization machinery
 can be developed using TS alone

that was the argument that the POK favorite son operating system people
used when attempt was made to add comapre-and-swap to 370.

charlie had invented compare-and-swap while doing fine grain
multiprocessing locking for cp67 at the cambridge science center.
attempts to add it to 370 was met by opposition from the POK favorite
son operating system (having used single TS spin-lock for entering
os/360 360/65mp kernel). The owners of the 370 architecture then said in
order to justify compare-and-swap for 370, non-multiprocessor use cases
would be needed.

thus was born the non-multiprocessor specific use cases that still
appear in the principle of operations. TS use case was multiprocessor
set a lock serizliation followed by some operation (critical section)
followed by clearing the lock. compare-and-swap is single serialized
non-interruptable, atomic operation. The compare-and-swap use cases
include non-kernel, interruptable, multi-threaded (not necessarily
multiprocessor) operation performing atomic serialized operations w/o
the overhead of making kernel call to perform serialized operation.
misc. past posts mentioning multiprocessing and/or compare-and-swap
http://www.garlic.com/~lynn/subtopic.html#smp

compare-and-swap was fairly quickly adopted by large multi-threaded
applications like high throughput DBMS systems. compare-and-swap was so
useful, numerous other hardware platforms adopted it also (or
instructions with similar atomic semantics).

ibm's 801/risc architecture was originally developed in the period
around Future System (failed new machine architecture that was going to
completely replace 360/370) ... and i've frequently claimed that there
was a lot of 801 objectives to the extreme opposite of the FS
complexity.  One of the issues not to have cache consistency ... to
avoid the enormous performance penalty paid by FS mutliprocessor (and
even the extreme throughput penalty paid for 370 strong memory model
multiprocessor cache consistency). misc. past posts mentioning future
system 
http://www.garlic.com/~lynn/submain.html#futuresys

No cache-consistency pretty much ruled out multiprocessor operation
... as well as philosophy of all instructions needed to complete in
single cycle ... ruled out compare-and-swap instruction.

however, the lack of compare-and-swap instruction ... put the RS/6000 at
severe throughput disadvantage with open system RDBMS benchmarks
compared to other platforms (open system RDBMS had fall-back to kernal
call locking for few hardware platforms that didn't have
compare-and-swap semantics).

fairly early, compare-and-swap instruction emulation was added to the
rs/6000 AIX system call FLIH ... within a couple instructions of entry
to system call FLIH ... there was special case for compare-and-swap
emulation that then immediately returned to application. While it wasn't
useful for real multiprocessor operation, it did achieve the objective
not being interruptable while emulation processing was in progress.

misc. past posts mentioning 801/risc
http://www.garlic.com/~lynn/subtopic.html#801

-- 
virtualization experience starting Jan1968, online at home since Mar1970

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-19 Thread Lindy Mayfield
I have often thought about that but I don't quite understand why it is needed.  
Would you give an example of a problem that is fixed by this method?

Thanks!
Lindy

From: IBM Mainframe Discussion List [IBM-MAIN@LISTSERV.UA.EDU] on behalf of 
Anne  Lynn Wheeler [l...@garlic.com]
Sent: 19 November 2013 16:49
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

charlie had invented compare-and-swap while doing fine grain
multiprocessing locking for cp67 at the cambridge science center.
--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-19 Thread John Gilmore
It is always hard to make oneself understood in emotion-laden
situations, but I did try.  I prefaced my comment about the logical
adequacy of TS with the words

Long-winded and ugly . . .

Perhaps an analogy will help.  Boolean algebra is usually discussed
using the binary operations disjunction (inclusive-or) and conjunction
and the singulary operation negation.  It is, however, possible to
shown that one binary operation, either NOR or NAND suffices.

This is intellectually interesting.  It is not an argument for doing
boolean algebra using only one imperspicuous operation.  One instead
introduces trivial definitions of disjunction and conjunction in terms
of either NOR or NAND and then proceeds in the usual way.

Analogously, both flavors of Compare and Swap and, now, PLO are very
useful; and the fact that they are logically dispensable is NOT an
argument for dispensing with them.

Multiplication is dispensable too.  It can be replaced by successive
additions.  This is not, however, an argument for eliminating multiply
instructions.  Small boys are likely to cut themselves when they start
hacking about with Ockham's razor.

John Gilmore, Ashland, MA 01721 - USA

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-19 Thread Shmuel Metz (Seymour J.)
In
CAE1XxDGQkTVXZVP3fD-4=jv1ufzxijwt8_dxmrvw+gra6tn...@mail.gmail.com,
on 11/19/2013
   at 07:29 AM, John Gilmore jwgli...@gmail.com said:

Long-winded and ugly but functionally adequate serialization
machinery can be developed using TS alone

Turing Tar Pit: the situation where everything is possible but
nothing is easy.

I can't recall seeing any situation on a S/370 where CS and CDS
weren't preferable to TS for serialization.
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-19 Thread Lindy Mayfield
I've been having fun with this.  Build an entire CPU, assembler, vm, operating 
system, etc from nothing but a Nand gate.  I've gotten as far as building all 
the basic gates plus an ALU, all in a simplified HDL.  CPU, RAM all that to 
follow.  

http://nand2tetris.org/

Lindy

From: IBM Mainframe Discussion List [IBM-MAIN@LISTSERV.UA.EDU] on behalf of 
John Gilmore [jwgli...@gmail.com]
Sent: 19 November 2013 19:11
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Perhaps an analogy will help.  Boolean algebra is usually discussed
using the binary operations disjunction (inclusive-or) and conjunction
and the singulary operation negation.  It is, however, possible to
shown that one binary operation, either NOR or NAND suffices.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-18 Thread Ed Jaffe

On 11/10/2013 1:19 PM, Mark Zelden wrote:
I've not been paying that close of attention, but I'm more curious 
about what people did for these situations prior to PLO.


ENQ/DEQ or Latch

--
Edward E Jaffe
Phoenix Software International, Inc
831 Parkview Drive North
El Segundo, CA 90245
http://www.phoenixsoftware.com/

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-14 Thread David Crayford

On 14/11/2013 12:23 AM, Kenneth Wilkerson wrote:

If I read the article you sent correctly, this algorithm is using a spin
lock. It has provision for implementing a lock-free algorithm but none of
those are detailed.


Most of the shared_ptr implementations in the wild, including z/OS, use 
a lock-free algorithm invented by Alexander Terekhov (IBM). An atomic 
reference counted smart pointer is a nice tool to have in your bag.



PLO has no restrictions other than the limits of
my imagination.


Or patents! I notice IBM have quite a few wrt PLO. This one for example 
http://www.google.com.mx/patents/US6636883. The US patent office seems 
to like patents related to multi-processing. Maybe they think it's novel.


Software patents have a funky smell!



However, in the last sentence of the performance section, the authors state
Note that we make no claim that the atomic access approach unconditionally
outperforms the others, merely that it may be best for certain scenarios
that can be encountered in practice.. I think this is the key to lock-free
implementations. I have generalized primitive functions to perform most of
my PLO operations. But I design and tune each to the specific use. I
understand the desire to provide generalized functionality for high level
language use. However, I do not accept the premise that all lists are the
same. And I would certainly use different algorithms for lists that I
expected to get large and lists that have more than a small percentage of
update operations.


While it may not be true on z/OS, most software developers these days 
use high level languages for multi-threaded applications and prefer 
abstractions because they are easy to use and reason about. Of course, 
that doesn't mean that they shouldn't understand what's happening under 
the covers.  The trick is keeping the optimizer out of the mix which is 
where inline assembler comes in handy.


There are many high-quality (free) libraries for lock-free data 
structures that are relatively easy to port to z/OS. Using advanced 
language features it's quite simple to configure a highly granular 
concurrent queue by using policies. The difficult part is testing the 
bloody things!


typedef mpmc_queue fixed_array, lock_free, smart_ptr  multi_queue;
typedef spsc_queue priority_list, mutex, unique_ptr  blocking_queue;
typedef thread_pool multi_queue, max_threads8  task_pool;

socket_servertask_pool server;



Kenneth




-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of David Crayford
Sent: Tuesday, November 12, 2013 11:45 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

On 13/11/2013 12:34 PM, Kenneth Wilkerson wrote:

Actually, the algorithm performs well for read-often, write-rarely
list because the active chain count does not change and therefore
there are relatively infrequent re-drives.  The active chain count
only changes on an add or delete. So if there are infrequent adds and
deletes, there will be infrequent re-drives. And you are wrong,
readers will not contend unless two or more tasks are referencing the
exact same element simultaneously. And even then, the re-drive only

involves the update to the use count.

Thanks,  I get it now. Maybe IBM should have used PLO for the z/OS C++
shared_ptr SMR algorithm which has both weak/strong reference counts + the
pointer. They use a lock-free algorithm using CS
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2674.htm.
shared_ptr implements proxy based garbage collection.


There are a lot of optimization that I do not describe in this
algorithm for simplicity. For example, when you do a PLO DCAS, the
condition code is set to indicate which compare failed. This can be used

to optimize the re-drive.

There are many others optimizations with re-drives to make this more
efficient. And if you get away from traditional lists, there are even
more optimizations.

Like what? What do you replace linked lists with, arrays with offsets
instead of pointers?


Honestly, I provided this algorithm after reading the paper on hazard
pointers. The paper was written in 2002 and claimed there was no
atomic DCAS when PLO DCAS  became available in 2001. So I took a much
simpler algorithm that I had and modified it to use a use count to
accommodate traditional storage managers to prove that a PLO could be
used to manage a conventional list using a traditional storage manager
and provide a SMR algorithm without the need for DU level management
structures. I don't use many conventional lists and I have a proprietary

storage manager that does not use chains.

Most of my PLO operations are much simpler.
I would love to test this algorithm against any other SMR algorithm.
My point has been and remains, that PLO can be efficiently used to
serialize any list in a lock-free manner and even if it does take more
CPU

Re: Serialization without Enque

2013-11-14 Thread Kenneth Wilkerson
I'm with you on patents. I came across an IBM patent yesterday that was
dated 2009  describing a lock-free storage manager using cell pools  to
manage variable length storage. I invented my first one 30 years ago using
CAS. It was writing a more sophisticated version 10 years ago that led to my
research on PLO.  I'm now on my fourth version of this storage manager, this
one I wrote for me,  and it's much more sophisticated than the patented
algorithm with numerous more capabilities. 

Like music, every piece of software is based on what we've seen before.  So
my real question with software patents is how do you prove it's original?
It's the chicken and the egg. Which came first?

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of David Crayford
Sent: Thursday, November 14, 2013 6:40 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

On 14/11/2013 12:23 AM, Kenneth Wilkerson wrote:
 If I read the article you sent correctly, this algorithm is using a 
 spin lock. It has provision for implementing a lock-free algorithm but 
 none of those are detailed.

Most of the shared_ptr implementations in the wild, including z/OS, use a
lock-free algorithm invented by Alexander Terekhov (IBM). An atomic
reference counted smart pointer is a nice tool to have in your bag.

 PLO has no restrictions other than the limits of my imagination.

Or patents! I notice IBM have quite a few wrt PLO. This one for example
http://www.google.com.mx/patents/US6636883. The US patent office seems to
like patents related to multi-processing. Maybe they think it's novel.

Software patents have a funky smell!


 However, in the last sentence of the performance section, the authors 
 state Note that we make no claim that the atomic access approach 
 unconditionally outperforms the others, merely that it may be best for 
 certain scenarios that can be encountered in practice.. I think this 
 is the key to lock-free implementations. I have generalized primitive 
 functions to perform most of my PLO operations. But I design and tune 
 each to the specific use. I understand the desire to provide 
 generalized functionality for high level language use. However, I do 
 not accept the premise that all lists are the same. And I would 
 certainly use different algorithms for lists that I expected to get 
 large and lists that have more than a small percentage of update
operations.

While it may not be true on z/OS, most software developers these days use
high level languages for multi-threaded applications and prefer abstractions
because they are easy to use and reason about. Of course, that doesn't mean
that they shouldn't understand what's happening under the covers.  The trick
is keeping the optimizer out of the mix which is where inline assembler
comes in handy.

There are many high-quality (free) libraries for lock-free data structures
that are relatively easy to port to z/OS. Using advanced language features
it's quite simple to configure a highly granular concurrent queue by using
policies. The difficult part is testing the bloody things!

typedef mpmc_queue fixed_array, lock_free, smart_ptr  multi_queue; typedef
spsc_queue priority_list, mutex, unique_ptr  blocking_queue; typedef
thread_pool multi_queue, max_threads8  task_pool;

socket_servertask_pool server;


 Kenneth
 --
 --
 


 -Original Message-
 From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] 
 On Behalf Of David Crayford
 Sent: Tuesday, November 12, 2013 11:45 PM
 To: IBM-MAIN@LISTSERV.UA.EDU
 Subject: Re: Serialization without Enque

 On 13/11/2013 12:34 PM, Kenneth Wilkerson wrote:
 Actually, the algorithm performs well for read-often, write-rarely 
 list because the active chain count does not change and therefore 
 there are relatively infrequent re-drives.  The active chain count 
 only changes on an add or delete. So if there are infrequent adds and 
 deletes, there will be infrequent re-drives. And you are wrong, 
 readers will not contend unless two or more tasks are referencing the 
 exact same element simultaneously. And even then, the re-drive only
 involves the update to the use count.

 Thanks,  I get it now. Maybe IBM should have used PLO for the z/OS C++ 
 shared_ptr SMR algorithm which has both weak/strong reference counts + 
 the pointer. They use a lock-free algorithm using CS 
 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2674.htm.
 shared_ptr implements proxy based garbage collection.

 There are a lot of optimization that I do not describe in this 
 algorithm for simplicity. For example, when you do a PLO DCAS, the 
 condition code is set to indicate which compare failed. This can be 
 used
 to optimize the re-drive.
 There are many others optimizations with re-drives to make this more 
 efficient. And if you get away from traditional lists, there are even

Re: Serialization without Enque

2013-11-14 Thread Mike Schwab
http://www.techdirt.com/articles/20130723/09543223903/joel-spolsky-stackexchange-thwarts-broad-microsoft-patent-app-using-microsofts-own-prior-art.shtml

Submit Prior Art to U.S. Patent office Appilications
http://patents.stackexchange.com/

On Thu, Nov 14, 2013 at 7:19 AM, Kenneth Wilkerson
redb...@austin.rr.com wrote:
 I'm with you on patents. I came across an IBM patent yesterday that was
 dated 2009  describing a lock-free storage manager using cell pools  to
 manage variable length storage. I invented my first one 30 years ago using
 CAS. It was writing a more sophisticated version 10 years ago that led to my
 research on PLO.  I'm now on my fourth version of this storage manager, this
 one I wrote for me,  and it's much more sophisticated than the patented
 algorithm with numerous more capabilities.

 Like music, every piece of software is based on what we've seen before.  So
 my real question with software patents is how do you prove it's original?
 It's the chicken and the egg. Which came first?

 Kenneth



-- 
Mike A Schwab, Springfield IL USA
Where do Forest Rangers go to get away from it all?

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-14 Thread Shmuel Metz (Seymour J.)
In 5284c4bc.5000...@gmail.com, on 11/14/2013
   at 08:40 PM, David Crayford dcrayf...@gmail.com said:

Or patents! I notice IBM have quite a few wrt PLO.

Could they be defensive patents? The only really effective way to
prevent someone else from patenting a technique that you're using is
to file first. Unless and until the USPTO gets serious about excluding
things that are obvious to a practitioner or are prior art, I expect
to continue to see a flood of defensive patents, and not just from
IBM.

Note that even if the USPTO rejects a defensive patent as prior art,
it will have served its purpose.
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-14 Thread Tony Harminc
On 14 November 2013 09:26, Shmuel Metz (Seymour J.)
shmuel+ibm-m...@patriot.net wrote:
at 08:40 PM, David Crayford dcrayf...@gmail.com said:

Or patents! I notice IBM have quite a few wrt PLO.

 Could they be defensive patents? The only really effective way to
 prevent someone else from patenting a technique that you're using is
 to file first.

If you don't want to patent, but also don't want others to patent your
invention, you can publish in a place that gets reasonable exposure,
such as an academic journal in the field. Or if you're big enough
(like IBM), you can publish your own Technical Disclosure Journal just
for this purpose. Of course if you want the additional benefit of
having your competition not know of your publication, then you need to
publish in the most obscure journal in the most obscure language that
you can away with while still claiming that it's prior art. IIRC there
was a case in the 1980s or 90s where a US company published in Chinese
in the Acta Chimica Sinica, which the USPTO then claimed wasn't likely
enough to ever be seen by US researchers. Subsequently Microsoft and
other software companies have done the same, but since then it's
become more reasonable to expect that Chinese journals will be
noticed.

Meanwhile, IBM has discontinued its TDJ, perhaps for other reasons,
but perhaps because they're being more aggressive with patents as a
matter of policy.

 Unless and until the USPTO gets serious about excluding
 things that are obvious to a practitioner or are prior art, I expect
 to continue to see a flood of defensive patents, and not just from
 IBM.

IBM has patented a number of new instructions straight out of the
Principles of Operation, with no attempt to make the claims general.
Presumably this is just to make life harder for would-be suppliers of
competitive hardware. For example Move with Optional Specifications
got US patent 20070277014, although the claims are just a particular
application of the age-old notion of calling a subroutine with
arguments and doing different things based on those arguments. I can't
imagine it standing up to a challenge, but that costs a lot of money,
and when you have hundreds of instructions to patent...

 Note that even if the USPTO rejects a defensive patent as prior art, it will 
 have served its purpose.

Indeed, and it may perhaps be cheaper than other avenues.

Tony H.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-14 Thread Shmuel Metz (Seymour J.)
In
CAArMM9S8B0_nTv3XiVPv1JLtw2=HfMH5m7sNCFFo6Kd4=mAA=g...@mail.gmail.com,
on 11/14/2013
   at 03:26 PM, Tony Harminc t...@harminc.net said:

If you don't want to patent, but also don't want others to patent
your invention, you can publish in a place that gets reasonable
exposure,

The point of a defensive patent is to avoid expensive litigation, not
to win it. Unless the USPTO searches the journal as part of processing
patent applications, it just doesn't have the same practical effect as
filing for a patent.
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-13 Thread Shmuel Metz (Seymour J.)
In 001801cee029$af0d2bc0$0d278340$@austin.rr.com, on 11/12/2013
   at 10:34 PM, Kenneth Wilkerson redb...@austin.rr.com said:

Actually, the algorithm performs well for read-often, write-rarely
list because the active chain count does not change and therefore
there are relatively infrequent re-drives.

What happens if the is an intervening add and also an intervening
remove, leaving no net change in the active chain count even though
the chain itself has changed?
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-13 Thread Kenneth Wilkerson
The active count is incremented for every add and delete. It is never
decremented so any update would result in a change.  In my actual
algorithms, all my lists are in shared or common memory objects so all the
pointers are 64 bit and I use +2 variations on the PLO. In this case, I use
a counter with 2 full words on a double word boundary. The first full word
is the change count and it's always incremented. The second full word is
element count and it's is incremented for each add and decremented for each
deleted. I load the counter with a LG and then use ALG and AL or SL to
manipulate the high or low word.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Shmuel Metz (Seymour J.)
Sent: Wednesday, November 13, 2013 7:29 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

In 001801cee029$af0d2bc0$0d278340$@austin.rr.com, on 11/12/2013
   at 10:34 PM, Kenneth Wilkerson redb...@austin.rr.com said:

Actually, the algorithm performs well for read-often, write-rarely list 
because the active chain count does not change and therefore there are 
relatively infrequent re-drives.

What happens if the is an intervening add and also an intervening remove,
leaving no net change in the active chain count even though the chain itself
has changed?
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-13 Thread Kenneth Wilkerson
If I read the article you sent correctly, this algorithm is using a spin
lock. It has provision for implementing a lock-free algorithm but none of
those are detailed. There are certainly cases where spin-locks work very
effectively particularly if the thread finds a lock is unavailable and it
voluntarily relinquishes control. I try to avoid spin-locks because if they
are performed in a high priority application they can cause detrimental
system effects. The applications I write are system level and can be
executed from anywhere meaning that applications that use it may be in a
locked stated, in an SRB, a high priority, etc thus meaning that it behooves
me to only use hardware provided methods for serialization. This is the
primary reason I use PLO. PLO has no restrictions other than the limits of
my imagination.

However, in the last sentence of the performance section, the authors state
Note that we make no claim that the atomic access approach unconditionally
outperforms the others, merely that it may be best for certain scenarios
that can be encountered in practice.. I think this is the key to lock-free
implementations. I have generalized primitive functions to perform most of
my PLO operations. But I design and tune each to the specific use. I
understand the desire to provide generalized functionality for high level
language use. However, I do not accept the premise that all lists are the
same. And I would certainly use different algorithms for lists that I
expected to get large and lists that have more than a small percentage of
update operations.

Kenneth




-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of David Crayford
Sent: Tuesday, November 12, 2013 11:45 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

On 13/11/2013 12:34 PM, Kenneth Wilkerson wrote:
 Actually, the algorithm performs well for read-often, write-rarely 
 list because the active chain count does not change and therefore 
 there are relatively infrequent re-drives.  The active chain count 
 only changes on an add or delete. So if there are infrequent adds and 
 deletes, there will be infrequent re-drives. And you are wrong, 
 readers will not contend unless two or more tasks are referencing the 
 exact same element simultaneously. And even then, the re-drive only
involves the update to the use count.

Thanks,  I get it now. Maybe IBM should have used PLO for the z/OS C++
shared_ptr SMR algorithm which has both weak/strong reference counts + the
pointer. They use a lock-free algorithm using CS
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2674.htm. 
shared_ptr implements proxy based garbage collection.

 There are a lot of optimization that I do not describe in this 
 algorithm for simplicity. For example, when you do a PLO DCAS, the 
 condition code is set to indicate which compare failed. This can be used
to optimize the re-drive.
 There are many others optimizations with re-drives to make this more 
 efficient. And if you get away from traditional lists, there are even 
 more optimizations.

Like what? What do you replace linked lists with, arrays with offsets
instead of pointers?

 Honestly, I provided this algorithm after reading the paper on hazard 
 pointers. The paper was written in 2002 and claimed there was no 
 atomic DCAS when PLO DCAS  became available in 2001. So I took a much 
 simpler algorithm that I had and modified it to use a use count to 
 accommodate traditional storage managers to prove that a PLO could be 
 used to manage a conventional list using a traditional storage manager 
 and provide a SMR algorithm without the need for DU level management 
 structures. I don't use many conventional lists and I have a proprietary
storage manager that does not use chains.
 Most of my PLO operations are much simpler.

 I would love to test this algorithm against any other SMR algorithm. 
 My point has been and remains, that PLO can be efficiently used to 
 serialize any list in a lock-free manner and even if it does take more 
 CPU this will be offset by increased throughput.

 And just because UNIX has issues with PLO doesn't mean the issue is 
 with PLO...

UNIX doesn't have an issue with PLO. It clearly states that popping/pushing
elements at the beginning of the queue is a good performer. Surely your
algorithm would have the same problem if multiple producers/consumers were
inserting/removing elements from the middle of a long list.

 Kenneth

 -Original Message-
 From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] 
 On Behalf Of David Crayford
 Sent: Tuesday, November 12, 2013 8:39 PM
 To: IBM-MAIN@LISTSERV.UA.EDU
 Subject: Re: Serialization without Enque

 Thanks for sharing your design Ken. It seems to me that PLO is best 
 used for data structures like double-ended queues where elements can 
 be inserted/removed

Re: Serialization without Enque

2013-11-12 Thread John Gilmore
Let's begin with queue-handling 101.  A queue is a FIFO list.  One
adds elements to the rear of a queue and removes them from the front.

Tony Harminc's implicit point, that these two operations can be
serialized and done unproblematically, is correct beyond argument.
There is never a requirement to use fallible, probabilistic methods to
do so.

This said, it is all but certain that Jon Perryman is trying to
address OTHER problems too.  It is clear from what he has said that he
must sometimes find and remove an element that is not necessarily at
the front of his queue.

How is such an element identified?  By address?  By key?   By a
sequence number of timestamp?

What other kinds of processing does he need to do?  Does he, for
example, sometimes need to insert an element at some position other
than the rear of this queue?  [Moving elements in the queue is
equivalent to deleting them from one position and reinserting them
elsewhere.]

Once his functional requirements are understood control blocks and
non-probabilistic code to meet them will be easy enough to devise.

The appropriate use of pointers to pointers, for example, greatly
simplifies many queue-management problems, making it unnecessary to
distinguish removal operations positionally.  One need not care
whether one is removing an element at the front, rear, or in some
intermediate position in a queue.  Moreover, one should not.  In
addition to being inelegant/messy special casing is error-prone.

We thus need more information if we are to be helpful.

John Gilmore, Ashland, MA 01721 - USA

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-12 Thread Kenneth Wilkerson
I use cell pools. I also use a proprietary storage manager that doesn't use
chains. These methodology offer me capabilities well beyond those found in
traditional methods. Much of what I do is based on these capabilities, but
the algorithms could easily be adapted to use a conventional storage manager
that uses chains. 

Here is an algorithm I use that  I've adopted for traditional storage
management. This algorithm will work for any list; LIFO,  FIFO, ordered or
other , and for deletion from the head, middle or tail. 

Setup: Allocate a word in each element. Bit 0 is one for all active
elements. Bit 1 is one for all elements pending deletion. Bit 2 is `reserved
for an extension to this algorithm (such as garbage cleanup). The remaining
bits are a use count allowing for many more DUs than are supported by MVS.

Note.: When PLO Compare and Swap (CS) or Double Compare and Swap (DCAS) is
referenced, the PLO uses the use count address as the lock word. This will
serialize all updates to the use counter for that element. For the PLO
Compare and Loads or PLO update, the lock word is the active chain counter.

To search:
Step 1:  Use the PLO Compare and Load on the active chain counter to search
the chain as before.  If the PLO fails, re-drive the search.

Step 2:  Before examining the element, increment the use count with a PLO
Double Compare and Swap (DCAS). Load the first register pair with the
current chain counter. The swap value will also be the current chain
counter. Essentially, we’re using the active chain count to serialize
increments  to the use count to avoid accessing an area that may have been
released. The second register pair will contain the current use count with a
swap value incremented by 1 using an AL to avoid resetting the high bit. If
the PLO DCAS fails, the previous PLO Compare and Load (Step 1) should be
re-driven.

Step 3: Use the PLO Compare and Load for the next element. Save the PLO
status and decrement the use count with a SL using PLO CS. We don't need a
DCAS because the use count is not 0 and this element can't be deleted.  If
this PLO CS fails, re-drive it. If PLO Compare and Load status is re-drive,
then before re-driving the search, check the use count (in the register used
to update it). If  Bit 1 (pending delete) is set and the use count is 0,
this task can release it.

To delete:  (this assumes the deleting task has already updated the use
count in the process of finding the element to delete)
Step1: Use a PLO update function to remove the element from the active chain
to avoid any future references. 

Step2:  If the PLO update  to remove the element fails,  decrement the use
count but do NOT set bit 1 (pending delete) using a PLO CS. If the PLO CS
fails, re-drive it . 

Step 3: If the PLO update to remove the element succeeds, decrement the use
count,  SET bit 1 (pending delete) and RESET Bit 0 (active bit) using a PL
CS. If the PLO CS fails, re-drive it .
 
Step 4: Whether the PLO update succeeded or failed, check the use count in
the register used to update it:
 If bit 1 (pending delete) is set and the use count is
not 0, then this task should exit . 
  If bit 1 (pending delete) is set and the use count is
0, then this task can release it.
 Otherwise, this task must re-drive the search to find
the element to be deleted   

You can work out the various scenarios yourself. But because the count is
incremented/decremented after a PLO Compare and Load or update, the status
of the PLO provides a decision point on whether an element may have been
deleted. Using the use count address as the lock word insures that all use
count updates for a specific use count occur serially. There are numerous
extensions to this algorithm that are more than I want to describe. Things
like adding pending deletes to a delete chain or having an asynchronous,
periodic garbage collection task handle the deletes.  

Kenneth
-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jon Perryman
Sent: Monday, November 11, 2013 9:38 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Could you provide an insight to a design that would handle the situation
where an SSI program with non-trivial workload uses a very large list. This
list is normally semi-static but there can be periods of time where entries
are heavily deleted and added. Not freeing storage is an option because it
could be a significant amount of storage. 

Thanks, Jon Perryman.




 From: Tony Harminc t...@harminc.net
To: IBM-MAIN@LISTSERV.UA.EDU
Sent: Monday, November 11, 2013 7:07 PM
Subject: Re: Serialization without Enque
 

On 11 November 2013 20:15, Jon Perryman jperr...@pacbell.net wrote:
      L    R2,QUEUE
      L    R3,NEXT_ENTRY
     CS  R2,R3,QUEUE                    New queue head  While this 
seems bullet proof, it's not. If there is a long

Re: Serialization without Enque

2013-11-12 Thread Shmuel Metz (Seymour J.)
In 1384218943.19861.yahoomail...@web181001.mail.ne1.yahoo.com, on
11/11/2013
   at 05:15 PM, Jon Perryman jperr...@pacbell.net said:

Take for example:
     L    R2,QUEUE
     L    R3,NEXT_ENTRY
    CS  R2,R3,QUEUE                    New queue head

I'm not sure what you're trying to do. Did you mean this?

LOOP     L     R2,QUEUE
     L     R3,NEXT_ENTRY
 STR2,0(,R3)   Forward link
     CS    R2,R3,QUEUE         New queue head
 BNE   LOOP

While this seems bullet proof, it's not. If there is a long delay
between between the L instructions then next entry could have been
freed causing an abend.

It won't cause an ABEND in the code that you showed, and it is just as
serious an error to free it *after* you add it to the queue. Or did
you mean this?

LOOP     L     R2,QUEUE
 USING ENTRY,R2
     L     R3,NEXT_ENTRY
 DROP  R2
     CS    R2,R3,QUEUE         New queue head
 BNE   LOOP

As with adding an entry, freeing the storage inappropriately is an
error when the entry is still on the queue, not just during the CS
loop.

You need to think about the small details.

Not just the serialization.
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-12 Thread Jon Perryman
Thanks Kenneth for the explanation. It's really good stuff. All the 
serialization that I've seen was prior to the PLO instruction. I now see that 
it has certainly improved our ability to serialize. 

Jon Perryman.




 From: Kenneth Wilkerson redb...@austin.rr.com


I use cell pools. I also use a proprietary storage manager that doesn't use
chains. These methodology offer me capabilities well beyond those found in
traditional methods. Much of what I do is based on these capabilities, but
the algorithms could easily be adapted to use a conventional storage manager
that uses chains. 


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-12 Thread Jon Perryman
The example was simply to show a point without using an involved sample. At a 
minimum, I should have included a USING for the queue entry. Sorry for the 
confusion.

Jon Perryman.




 From: Shmuel Metz (Seymour J.) shmuel+ibm-m...@patriot.net


In 1384218943.19861.yahoomail...@web181001.mail.ne1.yahoo.com, on
11/11/2013
   at 05:15 PM, Jon Perryman jperr...@pacbell.net said:

Take for example:
     L    R2,QUEUE
     L    R3,NEXT_ENTRY
    CS  R2,R3,QUEUE                    New queue head

I'm not sure what you're trying to do. Did you mean this?


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-12 Thread David Crayford
 [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jon Perryman
Sent: Monday, November 11, 2013 9:38 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Could you provide an insight to a design that would handle the situation
where an SSI program with non-trivial workload uses a very large list. This
list is normally semi-static but there can be periods of time where entries
are heavily deleted and added. Not freeing storage is an option because it
could be a significant amount of storage.

Thanks, Jon Perryman.





From: Tony Harminc t...@harminc.net
To: IBM-MAIN@LISTSERV.UA.EDU
Sent: Monday, November 11, 2013 7:07 PM
Subject: Re: Serialization without Enque


On 11 November 2013 20:15, Jon Perryman jperr...@pacbell.net wrote:

   LR2,QUEUE
   LR3,NEXT_ENTRY
  CS  R2,R3,QUEUENew queue head  While this
seems bullet proof, it's not. If there is a long delay between between
the L instructions then next entry could  have been freed causing an

abend.

If your code is referencing an item that may have been freed, then your
design is wrong.


While the delay is very unlikely, it is still a possiblity (e.g.
swapout after first L). This example is just removing the first entry.

If you try to remove an entry in the middle of the chain, you run a larger
risk.

Some of the techniques we use to minimize risk:
1. Abend recovery
2. don't overload the entry - only contains shared information.
3. Place entry on a timeout queue before placing on a free queue or

freeing.

4. Copy the entry to workarea - reduces reference time of actual
entry 5. Eyecatchers that can be verified.
6. Unique identifier for each entry that can be verified.
7. Turn off interrupts.

Most of these are indicative of incorrect design rather than of careful
attention to risk. If it's unclear whether a queue entry is eligible to
be worked on or even to be looked at unless you do it in a hurry, then
there is something quite wrong.

By all means provide abend recovery, but don't make it part of the
queue handling design.

Tony H.

--
For IBM-MAIN subscribe / signoff / archive access instructions, send
email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN




--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-12 Thread Kenneth Wilkerson
Actually, the algorithm performs well for read-often, write-rarely list
because the active chain count does not change and therefore there are
relatively infrequent re-drives.  The active chain count only changes on an
add or delete. So if there are infrequent adds and deletes, there will be
infrequent re-drives. And you are wrong, readers will not contend unless two
or more tasks are referencing the exact same element simultaneously. And
even then, the re-drive only involves the update to the use count. 

There are a lot of optimization that I do not describe in this algorithm for
simplicity. For example, when you do a PLO DCAS, the condition code is set
to indicate which compare failed. This can be used to optimize the re-drive.
There are many others optimizations with re-drives to make this more
efficient. And if you get away from traditional lists, there are even more
optimizations.

Honestly, I provided this algorithm after reading the paper on hazard
pointers. The paper was written in 2002 and claimed there was no atomic DCAS
when PLO DCAS  became available in 2001. So I took a much simpler algorithm
that I had and modified it to use a use count to accommodate traditional
storage managers to prove that a PLO could be used to manage a conventional
list using a traditional storage manager and provide a SMR algorithm without
the need for DU level management structures. I don't use many conventional
lists and I have a proprietary storage manager that does not use chains.
Most of my PLO operations are much simpler. 

I would love to test this algorithm against any other SMR algorithm. My
point has been and remains, that PLO can be efficiently used to serialize
any list in a lock-free manner and even if it does take more CPU this will
be offset by increased throughput.

And just because UNIX has issues with PLO doesn't mean the issue is with
PLO...

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of David Crayford
Sent: Tuesday, November 12, 2013 8:39 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Thanks for sharing your design Ken. It seems to me that PLO is best used for
data structures like double-ended queues where elements can be
inserted/removed from both ends of the queue atomically. In the case of a
read-often-write-rarely list with multiple readers that traverse the list it
doesn't seem optimal. Correct me if I'm wrong but readers will contend with
each other.

FYI, z/OS UNIX message queues can be configured to use PLO (with fallback to
a latch). The documentation states that inserting/removing from the middle
of a long list using PLO is a poor performer
http://pic.dhe.ibm.com/infocenter/zos/v1r12/topic/com.ibm.zos.r12.bpxb100/qc
t.htm#qct. 


There isn't a one-size-fits-all solution. It depends on the usage patterns.
Hopefully HTM will solve that. Depending on how Intels Haswell TSX
implementation performs in the wild we could see HTM in our cell phones as
early as next year.


On 12/11/2013 11:18 PM, Kenneth Wilkerson wrote:
 I use cell pools. I also use a proprietary storage manager that 
 doesn't use chains. These methodology offer me capabilities well 
 beyond those found in traditional methods. Much of what I do is based 
 on these capabilities, but the algorithms could easily be adapted to 
 use a conventional storage manager that uses chains.

 Here is an algorithm I use that  I've adopted for traditional storage 
 management. This algorithm will work for any list; LIFO,  FIFO, 
 ordered or other , and for deletion from the head, middle or tail.

 Setup: Allocate a word in each element. Bit 0 is one for all active 
 elements. Bit 1 is one for all elements pending deletion. Bit 2 is 
 `reserved for an extension to this algorithm (such as garbage 
 cleanup). The remaining bits are a use count allowing for many more DUs
than are supported by MVS.

 Note.: When PLO Compare and Swap (CS) or Double Compare and Swap 
 (DCAS) is referenced, the PLO uses the use count address as the lock 
 word. This will serialize all updates to the use counter for that 
 element. For the PLO Compare and Loads or PLO update, the lock word is the
active chain counter.

 To search:
 Step 1:  Use the PLO Compare and Load on the active chain counter to 
 search the chain as before.  If the PLO fails, re-drive the search.

 Step 2:  Before examining the element, increment the use count with a 
 PLO Double Compare and Swap (DCAS). Load the first register pair with 
 the current chain counter. The swap value will also be the current 
 chain counter. Essentially, we're using the active chain count to 
 serialize increments  to the use count to avoid accessing an area that 
 may have been released. The second register pair will contain the 
 current use count with a swap value incremented by 1 using an AL to 
 avoid resetting the high bit. If the PLO DCAS fails, the previous PLO 
 Compare and Load (Step 1) should be re-driven

Re: Serialization without Enque

2013-11-12 Thread David Crayford

On 13/11/2013 12:34 PM, Kenneth Wilkerson wrote:

Actually, the algorithm performs well for read-often, write-rarely list
because the active chain count does not change and therefore there are
relatively infrequent re-drives.  The active chain count only changes on an
add or delete. So if there are infrequent adds and deletes, there will be
infrequent re-drives. And you are wrong, readers will not contend unless two
or more tasks are referencing the exact same element simultaneously. And
even then, the re-drive only involves the update to the use count.


Thanks,  I get it now. Maybe IBM should have used PLO for the z/OS C++ 
shared_ptr SMR algorithm which has both weak/strong reference counts + 
the pointer. They use a lock-free algorithm using CS 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2674.htm. 
shared_ptr implements proxy based garbage collection.



There are a lot of optimization that I do not describe in this algorithm for
simplicity. For example, when you do a PLO DCAS, the condition code is set
to indicate which compare failed. This can be used to optimize the re-drive.
There are many others optimizations with re-drives to make this more
efficient. And if you get away from traditional lists, there are even more
optimizations.


Like what? What do you replace linked lists with, arrays with offsets 
instead of pointers?



Honestly, I provided this algorithm after reading the paper on hazard
pointers. The paper was written in 2002 and claimed there was no atomic DCAS
when PLO DCAS  became available in 2001. So I took a much simpler algorithm
that I had and modified it to use a use count to accommodate traditional
storage managers to prove that a PLO could be used to manage a conventional
list using a traditional storage manager and provide a SMR algorithm without
the need for DU level management structures. I don't use many conventional
lists and I have a proprietary storage manager that does not use chains.
Most of my PLO operations are much simpler.



I would love to test this algorithm against any other SMR algorithm. My
point has been and remains, that PLO can be efficiently used to serialize
any list in a lock-free manner and even if it does take more CPU this will
be offset by increased throughput.

And just because UNIX has issues with PLO doesn't mean the issue is with
PLO...


UNIX doesn't have an issue with PLO. It clearly states that 
popping/pushing elements at the beginning of the queue is a good 
performer. Surely your algorithm would have the same problem if multiple 
producers/consumers were inserting/removing elements from the middle of 
a long list.



Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of David Crayford
Sent: Tuesday, November 12, 2013 8:39 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Thanks for sharing your design Ken. It seems to me that PLO is best used for
data structures like double-ended queues where elements can be
inserted/removed from both ends of the queue atomically. In the case of a
read-often-write-rarely list with multiple readers that traverse the list it
doesn't seem optimal. Correct me if I'm wrong but readers will contend with
each other.

FYI, z/OS UNIX message queues can be configured to use PLO (with fallback to
a latch). The documentation states that inserting/removing from the middle
of a long list using PLO is a poor performer
http://pic.dhe.ibm.com/infocenter/zos/v1r12/topic/com.ibm.zos.r12.bpxb100/qc
t.htm#qct.


There isn't a one-size-fits-all solution. It depends on the usage patterns.
Hopefully HTM will solve that. Depending on how Intels Haswell TSX
implementation performs in the wild we could see HTM in our cell phones as
early as next year.


On 12/11/2013 11:18 PM, Kenneth Wilkerson wrote:

I use cell pools. I also use a proprietary storage manager that
doesn't use chains. These methodology offer me capabilities well
beyond those found in traditional methods. Much of what I do is based
on these capabilities, but the algorithms could easily be adapted to
use a conventional storage manager that uses chains.

Here is an algorithm I use that  I've adopted for traditional storage
management. This algorithm will work for any list; LIFO,  FIFO,
ordered or other , and for deletion from the head, middle or tail.

Setup: Allocate a word in each element. Bit 0 is one for all active
elements. Bit 1 is one for all elements pending deletion. Bit 2 is
`reserved for an extension to this algorithm (such as garbage
cleanup). The remaining bits are a use count allowing for many more DUs

than are supported by MVS.

Note.: When PLO Compare and Swap (CS) or Double Compare and Swap
(DCAS) is referenced, the PLO uses the use count address as the lock
word. This will serialize all updates to the use counter for that
element. For the PLO Compare and Loads or PLO update, the lock word is the

active chain counter.

To search:
Step 1:  Use

Re: Serialization without Enque

2013-11-11 Thread John Gilmore
What are bakery style ticketing algorithms?  Take-a-number schemes?

Googling bakery style landed me among recipes/receipts for
chocolate-chip cookies and cupcakes with pink icing.

John Gilmore, Ashland, MA 01721 - USA

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread Donald Likens
My control blocks are actually a queue. So it my understanding from reading 
these updates (Thank You!) that all I really need is autonomic functions (CS or 
PLO) to update the queue.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread Peter Relson
suggests that Hardware Transaction Memory may not be the panacea we all 
expect it to be, and in some cases may actually increase CPU 

Of course it's true that if a transaction experiences too much contention 
and resorts to its fallback path, you have used more CPU than if you went 
directly to the fallback path. That is specifically why every use of 
non-constrained transactions ought to do analysis to determine if it is 
even theoretically beneficial.

What I don't see mentioned in the article is zEC12's constrained 
transactions. By their very definition they need no fallback path.  That 
is a huge benefit both in terms of complexity and development/test cost.

In PLO, the hardware locking occurs according to the lock word. 

You seem to be assuming that PLO implementation actually is truly locked 
according to the individual lock word. Maybe it is now. It definitely did 
not used to be. The machine would decide how to map the 
individually-specified lock word to (limited) hardware resources that were 
the true serialization mechanism. It was not necessarily one to one.

Transaction Memory sounds exciting but it's complex. IBM should put a 
layer of abstraction on top with simple semantics.

Maybe it's me, but I don't really find TBEGIN...TEND complex compared to 
other serializing techniques even when you factor in PPI while counting 
the number of attempts before taking the fallback path. The instructions 
within a transaction are typically less complex than the instructions you 
would need without a transaction, if you could even accomplish what you're 
trying to do outside of a transaction. For example, there is no need for 
CS, PLO. Just more straightforward compare, store, etc.

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread Shmuel Metz (Seymour J.)
In 8451960728331713.wa.markmzelden@listserv.ua.edu, on
11/10/2013
   at 03:19 PM, Mark Zelden m...@mzelden.com said:

I've not been paying that close of attention, but I'm more curious
about what people did for these situations prior to PLO.

CS. In some cases PLO can be more natural.
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread David Crayford
 On 11 Nov 2013, at 8:22 pm, John Gilmore jwgli...@gmail.com wrote:
 
 What are bakery style ticketing algorithms?  Take-a-number schemes?
 

http://en.m.wikipedia.org/wiki/Lamport's_bakery_algorithm


 Googling bakery style landed me among recipes/receipts for
 chocolate-chip cookies and cupcakes with pink icing.
 
 John Gilmore, Ashland, MA 01721 - USA
 
 --
 For IBM-MAIN subscribe / signoff / archive access instructions,
 send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread DASDBILL2
Small problem with the url:  the bold, blue, underlining does not extend to  
the end of the url.  Copy the  whole character string and past into your 
web browser instead of simply clicking on the bold, blue, underlined part of 
the url. 
  
Bill Fairchild 
Franklinm, TN 
  
  

- Original Message -

From: David Crayford dcrayf...@gmail.com 
To: IBM-MAIN@LISTSERV.UA.EDU 
Sent: Monday, November 11, 2013 7:22:38 AM 
Subject: Re: Serialization without Enque 

 On 11 Nov 2013, at 8:22 pm, John Gilmore jwgli...@gmail.com wrote: 
 
 What are bakery style ticketing algorithms?  Take-a-number schemes? 
 

http://en.m.wikipedia.org/wiki/Lamport's_bakery_algorithm 


 Googling bakery style landed me among recipes/receipts for 
 chocolate-chip cookies and cupcakes with pink icing. 
 
 John Gilmore, Ashland, MA 01721 - USA 
 
 -- 
 For IBM-MAIN subscribe / signoff / archive access instructions, 
 send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN 

-- 
For IBM-MAIN subscribe / signoff / archive access instructions, 
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN 


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread Scott Ford
Can you guys tell me what you consider the appropriate / safe way to enqueue-de 
queue a resource being shared

Scott ford
www.identityforge.com
from my IPAD

'Infinite wisdom through infinite means'


 On Nov 11, 2013, at 8:48 AM, DASDBILL2 dasdbi...@comcast.net wrote:
 
 Small problem with the url:  the bold, blue, underlining does not extend to  
 the end of the url.  Copy the  whole character string and past into your 
 web browser instead of simply clicking on the bold, blue, underlined part of 
 the url. 
   
 Bill Fairchild 
 Franklinm, TN 
   
   
 
 - Original Message -
 
 From: David Crayford dcrayf...@gmail.com 
 To: IBM-MAIN@LISTSERV.UA.EDU 
 Sent: Monday, November 11, 2013 7:22:38 AM 
 Subject: Re: Serialization without Enque 
 
 On 11 Nov 2013, at 8:22 pm, John Gilmore jwgli...@gmail.com wrote: 
 
 What are bakery style ticketing algorithms?  Take-a-number schemes?
 
 http://en.m.wikipedia.org/wiki/Lamport's_bakery_algorithm 
 
 
 Googling bakery style landed me among recipes/receipts for 
 chocolate-chip cookies and cupcakes with pink icing. 
 
 John Gilmore, Ashland, MA 01721 - USA 
 
 -- 
 For IBM-MAIN subscribe / signoff / archive access instructions, 
 send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
 
 -- 
 For IBM-MAIN subscribe / signoff / archive access instructions, 
 send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN 
 
 
 --
 For IBM-MAIN subscribe / signoff / archive access instructions,
 send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread David Crayford

On 11/11/2013 9:48 PM, DASDBILL2 wrote:

Small problem with the url:  the bold, blue, underlining does not extend to  
the end of the url.  Copy the  whole character string and past into your
web browser instead of simply clicking on the bold, blue, underlined part of 
the url.
   


I copied that link in my iPhone. Here's the original.

http://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm



Bill Fairchild
Franklinm, TN
   
   


- Original Message -

From: David Crayford dcrayf...@gmail.com
To: IBM-MAIN@LISTSERV.UA.EDU
Sent: Monday, November 11, 2013 7:22:38 AM
Subject: Re: Serialization without Enque


On 11 Nov 2013, at 8:22 pm, John Gilmore jwgli...@gmail.com wrote:

What are bakery style ticketing algorithms?  Take-a-number schemes?


http://en.m.wikipedia.org/wiki/Lamport's_bakery_algorithm



Googling bakery style landed me among recipes/receipts for
chocolate-chip cookies and cupcakes with pink icing.

John Gilmore, Ashland, MA 01721 - USA

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread Kenneth Wilkerson
I read the article. It's still based on a CAS which I don't necessarily 
consider simpler than PLO . In this article, it states  Each hazard pointer 
can be written only by its associated thread, but can be read by all threads. 
This is exactly the provision I state in my example. It's a key element of any 
serialization algorithm that allows concurrent lock-free operations. Hazard 
pointers seem like an elegant general solution. I'll have to explore it but 
I've not had issues with any of the solutions I've written using other methods. 
I write a solution appropriate to the usage of the serialized resource. The key 
is a rigid protocol that has to always be followed. 

In my e-mail last night I meant transactional execution facility not 
transactional event facility. I've thoroughly read the POM on this facility but 
have not to date had access to a processor to experiment with it. From my 
reading, the transactional execution facility (TM) , appears to be a way to 
implement almost all requirements for concurrent read/write access to a 
serialized resource. Essentially, from my understanding of the POM, it's a CAS 
on as many operands as necessary within a hardware defined limit. In my very 
complex application, there are a couple scenarios where I cannot use PLO to 
serialize access. I have no problem using multi-step PLO operations for 
serialization as long as the integrity of the resource is guaranteed after each 
step. For example, a delete that consists of a PLO to remove from the active 
chain and a separate PLO to add the removed element to a free chain. However, 
some operations are too complex for a PLO CAS and triple store even if the 
operands are organized in storage such that you can modify 128 bits at a time. 
In these cases, I use a gate word and a spin lock. When available, the gate is 
0. When in use the gate contains identification information for the gate owner. 
I very rarely have to use these. And if I had a TM like transactional execution 
facility, I would replace this spin lock with this facility. Normally, there 
are only a handful of instructions within the gate so this has never caused me 
any problems.

In a senses, all methods, LL/CS, TM, etc. are spin locks. If they don't 
succeed, try, try again. All methods that I know of, the hardware must perform 
a memory serialization function. I use PLO instead of CS not only because of 
the increased functionality such as modifying noncontiguous areas and being 
able to modify up to 4 128 bit areas but also because I believe the  PLO  lock 
word is an advantage. In all hardware serialization methods that I know of, a 
memory serialization function is required during the LL/CS or TM. These 
serializing functions can be expensive. CS is not granular and the 
serialization proceeds without regard to accesses by other CPUs meaning the 
overhead occurs whether the function succeeds or fails. The PLO lock word 
gates access by all processors using the same lock word thus reducing the 
total number of serializations performed by stopping a processor performing a 
PLO using the same lock word. This requires careful selection of lock words and 
use of the same lock word by all processes that read/write to the same  
resource. This advantage can be negated in a queue that has a substantial 
percentage of write operations compared to read operations because write 
operations will necessarily result in a PLO failure. I believe this is the 
disadvantage referred to in your first paper on TM by Paul McKenney. I suspect 
that since IBM is using TM to replace a lock and that in most cases the lock 
was used to serialize storage alterations. In this case, CPU would increase but 
so would throughput. I believe this is a classic example of trading the less 
expensive CPU resource for the more expensive throughput. 

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of David Crayford
Sent: Sunday, November 10, 2013 8:56 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

On 11/11/2013 10:36 AM, Kenneth Wilkerson wrote:
 I read the article. This article is about transactional event facility 
 introduced in z/EC-12 and not PLO which is an LL/CS. I wish I had access to a 
 z/EC-12 with the transactional event facility to play with it and compare it 
 to PLO. The transactional event facility is much more comprehensive  and not 
 as granular as a PLO. In PLO, the hardware locking occurs according to the 
 lock word.

Transaction Memory sounds exciting but it's complex. IBM should put a layer of 
abstraction on top with simple semantics.

 I've done a lot of testing with PLO. It can increase CPU, particularly in a 
 situation where updates are much higher percentage of the operations. But in 
 all applications that I've tested, it's CPU overhead is offset by higher 
 throughput. In a traditional locking method, tasks end up serializing to the 
 lock.

There are lots

Re: Serialization without Enque

2013-11-11 Thread Kenneth Wilkerson
In PLO, the hardware locking occurs according to the lock word.

The POM is not specific about the lock word other than a transformation
occurs to generate a PLT logical address used to acquire a lock. However,
this does not affect its application. The key point is that 2 or more
processors executing PLO simultaneously can access or alter the values using
a  PLO instruction and the operation will occur as if one PLO followed the
other.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Peter Relson
Sent: Monday, November 11, 2013 7:09 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

suggests that Hardware Transaction Memory may not be the panacea we all 
expect it to be, and in some cases may actually increase CPU

Of course it's true that if a transaction experiences too much contention
and resorts to its fallback path, you have used more CPU than if you went
directly to the fallback path. That is specifically why every use of
non-constrained transactions ought to do analysis to determine if it is even
theoretically beneficial.

What I don't see mentioned in the article is zEC12's constrained
transactions. By their very definition they need no fallback path.  That is
a huge benefit both in terms of complexity and development/test cost.

In PLO, the hardware locking occurs according to the lock word. 

You seem to be assuming that PLO implementation actually is truly locked
according to the individual lock word. Maybe it is now. It definitely did
not used to be. The machine would decide how to map the
individually-specified lock word to (limited) hardware resources that were
the true serialization mechanism. It was not necessarily one to one.

Transaction Memory sounds exciting but it's complex. IBM should put a 
layer of abstraction on top with simple semantics.

Maybe it's me, but I don't really find TBEGIN...TEND complex compared to
other serializing techniques even when you factor in PPI while counting the
number of attempts before taking the fallback path. The instructions within
a transaction are typically less complex than the instructions you would
need without a transaction, if you could even accomplish what you're trying
to do outside of a transaction. For example, there is no need for CS, PLO.
Just more straightforward compare, store, etc.

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread Jon Perryman
Appropriate / safe is in the eye of the beholder. I personally see 
serialization as reducing the risks as much as possible. To eliminate the risk, 
you would have to orphan the entry. There were recommendations of moving to a 
free chain and reusing them. This has risks too because you simply don't know 
when everyone has finished with it. Only maintaining a use count would ensure 
it's no longer in use but that has risks too. We do what we have must to get 
the job done so we make the exposure as small as we possibly can. Take for 
example:
     L    R2,QUEUE
     L    R3,NEXT_ENTRY
    CS  R2,R3,QUEUE                    New queue head
While this seems bullet proof, it's not. If there is a long delay between 
between the L instructions then next entry could have been freed causing an 
abend. While the delay is very unlikely, it is still a possiblity (e.g. swapout 
after first L). This example is just removing the first entry. If you try to 
remove an entry in the middle of the chain, you run a larger risk.

Some of the techniques we use to minimize risk:
1. Abend recovery 

2. don't overload the entry - only contains shared information.
3. Place entry on a timeout queue before placing on a free queue or freeing.
4. Copy the entry to workarea - reduces reference time of actual entry 
5. Eyecatchers that can be verified.
6. Unique identifier for each entry that can be verified.
7. Turn off interrupts.

This is not a complete list. It's only meant to give you an idea of how complex 
the process can be. You need to think about the small details. Even the support 
staff is considered a risk because they may not fully understand the 
serialization requirements you implement.

Jon Perryman.



 From: Scott Ford scott_j_f...@yahoo.com


Can you guys tell me what you consider the appropriate / safe way to 
enqueue-de queue a resource being shared


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread David Crayford

On 11/11/2013 9:09 PM, Peter Relson wrote:

Maybe it's me, but I don't really find TBEGIN...TEND complex compared to
other serializing techniques even when you factor in PPI while counting
the number of attempts before taking the fallback path. The instructions
within a transaction are typically less complex than the instructions you
would need without a transaction, if you could even accomplish what you're
trying to do outside of a transaction. For example, there is no need for
CS, PLO. Just more straightforward compare, store, etc.


I agree, it's certainly much easier but it's still not *simple*. It 
requires several pages in PoPs to describe it, there's a parameter list, 
diagnostic block etc. That's not rocket science but I would prefer a 
compiler or libraries to do the hard work. Maybe I'm asking for too much 
but there is already a draft specification to make transactions quite 
simple from high-level languages 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3341.pdf. Some 
compilers have supported experimental constructs for transactional 
memory  for a while http://gcc.gnu.org/wiki/TransactionalMemory*. *The 
IBM C/C++ compiler already supports HTM instruction primitives as 
built-ins. I'm looking forward to when they implement the standard 
(Intel have already have an experimental implementation of the draft 
standard for Haswell).




--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread Tony Harminc
On 11 November 2013 20:15, Jon Perryman jperr...@pacbell.net wrote:
  LR2,QUEUE
  LR3,NEXT_ENTRY
 CS  R2,R3,QUEUENew queue head
 While this seems bullet proof, it's not. If there is a long delay between 
 between the L instructions then next entry could
 have been freed causing an abend.

If your code is referencing an item that may have been freed, then
your design is wrong.

 While the delay is very unlikely, it is still a possiblity (e.g. swapout 
 after first L). This example is just removing the first
 entry. If you try to remove an entry in the middle of the chain, you run a 
 larger risk.

 Some of the techniques we use to minimize risk:
 1. Abend recovery
 2. don't overload the entry - only contains shared information.
 3. Place entry on a timeout queue before placing on a free queue or freeing.
 4. Copy the entry to workarea - reduces reference time of actual entry
 5. Eyecatchers that can be verified.
 6. Unique identifier for each entry that can be verified.
 7. Turn off interrupts.

Most of these are indicative of incorrect design rather than of
careful attention to risk. If it's unclear whether a queue entry is
eligible to be worked on or even to be looked at unless you do it in a
hurry, then there is something quite wrong.

By all means provide abend recovery, but don't make it part of the
queue handling design.

Tony H.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-11 Thread Jon Perryman
Could you provide an insight to a design that would handle the situation where 
an SSI program with non-trivial workload uses a very large list. This list is 
normally semi-static but there can be periods of time where entries are heavily 
deleted and added. Not freeing storage is an option because it could be a 
significant amount of storage. 

Thanks, Jon Perryman.




 From: Tony Harminc t...@harminc.net
To: IBM-MAIN@LISTSERV.UA.EDU 
Sent: Monday, November 11, 2013 7:07 PM
Subject: Re: Serialization without Enque
 

On 11 November 2013 20:15, Jon Perryman jperr...@pacbell.net wrote:
      L    R2,QUEUE
      L    R3,NEXT_ENTRY
     CS  R2,R3,QUEUE                    New queue head
 While this seems bullet proof, it's not. If there is a long delay between 
 between the L instructions then next entry could
 have been freed causing an abend.

If your code is referencing an item that may have been freed, then
your design is wrong.

 While the delay is very unlikely, it is still a possiblity (e.g. swapout 
 after first L). This example is just removing the first
 entry. If you try to remove an entry in the middle of the chain, you run a 
 larger risk.

 Some of the techniques we use to minimize risk:
 1. Abend recovery
 2. don't overload the entry - only contains shared information.
 3. Place entry on a timeout queue before placing on a free queue or freeing.
 4. Copy the entry to workarea - reduces reference time of actual entry
 5. Eyecatchers that can be verified.
 6. Unique identifier for each entry that can be verified.
 7. Turn off interrupts.

Most of these are indicative of incorrect design rather than of
careful attention to risk. If it's unclear whether a queue entry is
eligible to be worked on or even to be looked at unless you do it in a
hurry, then there is something quite wrong.

By all means provide abend recovery, but don't make it part of the
queue handling design.

Tony H.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN




--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-10 Thread Shmuel Metz (Seymour J.)
In
caarmm9qm+goqco6p3nef5jsy1miorusy_c8ajjjoxhpeiry...@mail.gmail.com,
on 11/07/2013
   at 08:55 PM, Tony Harminc t...@harminc.net said:

It serializes happily against all the CS variations, TS, and the
newer interlocked-update instructions like ASI, LAA, and so on.

It's a bit more complicated than that; the behavior of ASI depends on
whether interlocked-access facility 1 is available. IAC, the point
remains that the user of a synchronization facility needs to
understand the rules and should not expect interoperability between
unrelated facilities, e.g., latches and semaphores.
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-10 Thread John Gilmore
From time to time Shmuel makes a seminal, crucial contribution; and

begin extract
the point remains that the user of a synchronization facility needs to
understand the rules and should not expect interoperability between
unrelated facilities, e.g., latches and semaphores.
/end extract

is one of them.

This thread has been a sort of celebration of misconceived expectations.

John Gilmore, Ashland, MA 01721 - USA

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-10 Thread Shmuel Metz (Seymour J.)
In 20131109.144735.898...@webmail11.dca.untd.com, on 11/09/2013
   at 07:47 PM, esst...@juno.com esst...@juno.com said:

It would be nice if someone would actually post some working code
using a PLO instruction,

It would be even nicer if IBM would add some PLO examples to Appendix
A of PoOps.
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-10 Thread Mark Zelden
On Sat, 9 Nov 2013 19:47:35 GMT, esst...@juno.com esst...@juno.com wrote:

I have been reading and following this thread sine PLO is not an instruction I 
use every day.
It would be nice if someone would actually post some working code using a PLO 
instruction, to illustrate how one would add an element to a queue and remove 
an element from a queue.

Paul D'Angelo

I've not been paying that close of attention, but I'm more curious about what 
people did
for these situations prior to PLO.

Mark
--
Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS  
mailto:m...@mzelden.com 
ITIL v3 Foundation Certified 
Mark's MVS Utilities: http://www.mzelden.com/mvsutil.html 
Systems Programming expert at http://search390.techtarget.com/ateExperts/
--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-10 Thread David Crayford

On 11/11/2013 5:19 AM, Mark Zelden wrote:

On Sat, 9 Nov 2013 19:47:35 GMT, esst...@juno.com esst...@juno.com wrote:


I have been reading and following this thread sine PLO is not an instruction I 
use every day.
It would be nice if someone would actually post some working code using a PLO 
instruction, to illustrate how one would add an element to a queue and remove 
an element from a queue.

Paul D'Angelo

I've not been paying that close of attention, but I'm more curious about what 
people did
for these situations prior to PLO.


They used smart algorithms using the atomic instructions they had, like 
RCU http://en.wikipedia.org/wiki/Read-copy-update. It's interesting that 
I have never seen any use of the PLO instruction in the zLinux kernel code.


Paul McKenney, IBMs expert on these things, wrote a good article that 
suggests that Hardware Transaction Memory may not be the panacea we all 
expect it to be, and in some cases may actually increase CPU 
http://paulmck.livejournal.com/31285.html.




Mark
--
Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS
mailto:m...@mzelden.com
ITIL v3 Foundation Certified
Mark's MVS Utilities: http://www.mzelden.com/mvsutil.html
Systems Programming expert at http://search390.techtarget.com/ateExperts/
--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-10 Thread Kenneth Wilkerson
I read the article. This article is about transactional event facility 
introduced in z/EC-12 and not PLO which is an LL/CS. I wish I had access to a 
z/EC-12 with the transactional event facility to play with it and compare it to 
PLO. The transactional event facility is much more comprehensive  and not as 
granular as a PLO. In PLO, the hardware locking occurs according to the lock 
word. 

I've done a lot of testing with PLO. It can increase CPU, particularly in a 
situation where updates are much higher percentage of the operations. But in 
all applications that I've tested, it's CPU overhead is offset by higher 
throughput. In a traditional locking method, tasks end up serializing to the 
lock. 

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of David Crayford
Sent: Sunday, November 10, 2013 6:50 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

On 11/11/2013 5:19 AM, Mark Zelden wrote:
 On Sat, 9 Nov 2013 19:47:35 GMT, esst...@juno.com esst...@juno.com wrote:

 I have been reading and following this thread sine PLO is not an instruction 
 I use every day.
 It would be nice if someone would actually post some working code using a 
 PLO instruction, to illustrate how one would add an element to a queue and 
 remove an element from a queue.

 Paul D'Angelo
 I've not been paying that close of attention, but I'm more curious 
 about what people did for these situations prior to PLO.

They used smart algorithms using the atomic instructions they had, like RCU 
http://en.wikipedia.org/wiki/Read-copy-update. It's interesting that I have 
never seen any use of the PLO instruction in the zLinux kernel code.

Paul McKenney, IBMs expert on these things, wrote a good article that suggests 
that Hardware Transaction Memory may not be the panacea we all expect it to be, 
and in some cases may actually increase CPU 
http://paulmck.livejournal.com/31285.html.


 Mark
 --
 Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS 
 mailto:m...@mzelden.com ITIL v3 Foundation Certified Mark's MVS 
 Utilities: http://www.mzelden.com/mvsutil.html
 Systems Programming expert at 
 http://search390.techtarget.com/ateExperts/
 --
 For IBM-MAIN subscribe / signoff / archive access instructions, send 
 email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to 
lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-10 Thread David Crayford

On 11/11/2013 10:36 AM, Kenneth Wilkerson wrote:

I read the article. This article is about transactional event facility 
introduced in z/EC-12 and not PLO which is an LL/CS. I wish I had access to a 
z/EC-12 with the transactional event facility to play with it and compare it to 
PLO. The transactional event facility is much more comprehensive  and not as 
granular as a PLO. In PLO, the hardware locking occurs according to the lock 
word.


Transaction Memory sounds exciting but it's complex. IBM should put a 
layer of abstraction on top with simple semantics.



I've done a lot of testing with PLO. It can increase CPU, particularly in a 
situation where updates are much higher percentage of the operations. But in 
all applications that I've tested, it's CPU overhead is offset by higher 
throughput. In a traditional locking method, tasks end up serializing to the 
lock.


There are lots of lock-free algorithms out there that do very job just 
using a simple CAS. RCU, hazard pointers to name but a few. Hazard 
pointers are interesting in how they deal with ownership 
http://www.research.ibm.com/people/m/michael/podc-2002.pdf (IBM patent 
warning!).


PLO is a fine instruction. It makes it easy to implement lock-free 
multi-producer multi-consumer stacks/queues. I'm interested
how would one use PLO to implement a fair reader/writer lock? I've seen 
some interesting bakery style ticketing algorithms. They-re basically 
spinlocks on steriods.




Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of David Crayford
Sent: Sunday, November 10, 2013 6:50 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

On 11/11/2013 5:19 AM, Mark Zelden wrote:

On Sat, 9 Nov 2013 19:47:35 GMT, esst...@juno.com esst...@juno.com wrote:


I have been reading and following this thread sine PLO is not an instruction I 
use every day.
It would be nice if someone would actually post some working code using a PLO 
instruction, to illustrate how one would add an element to a queue and remove 
an element from a queue.

Paul D'Angelo

I've not been paying that close of attention, but I'm more curious
about what people did for these situations prior to PLO.

They used smart algorithms using the atomic instructions they had, like RCU 
http://en.wikipedia.org/wiki/Read-copy-update. It's interesting that I have 
never seen any use of the PLO instruction in the zLinux kernel code.

Paul McKenney, IBMs expert on these things, wrote a good article that suggests 
that Hardware Transaction Memory may not be the panacea we all expect it to be, 
and in some cases may actually increase CPU 
http://paulmck.livejournal.com/31285.html.



Mark
--
Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS
mailto:m...@mzelden.com ITIL v3 Foundation Certified Mark's MVS
Utilities: http://www.mzelden.com/mvsutil.html
Systems Programming expert at
http://search390.techtarget.com/ateExperts/
--
For IBM-MAIN subscribe / signoff / archive access instructions, send
email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to 
lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-09 Thread Donald Likens
I see a flaw in my logic... I need to use PLO to serialize all this time.
--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Serialization without Enque

2013-11-09 Thread esst...@juno.com
I have been reading and following this thread sine PLO is not an instruction I 
use every day.
It would be nice if someone would actually post some working code using a PLO 
instruction, to illustrate how one would add an element to a queue and remove 
an element from a queue.

Paul D'Angelo

 
--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-09 Thread Kenneth Wilkerson

**

*  SOMEONE ELSE HAS UPDATED THE CHAIN. WE NEED TO RETURN THE BUILT

*  ELEMENT TO THE FREE CHAIN AND REDRIVE. REDRIVES ARE VERY INFREQUENT

ADDTOFREE  DS   0H

   XC   ELEMENT_PTRS,ELEMENT_PTRS

   LG   R14,QUEUE_AVCOUNTERS   GET COUNTER FOR AVAILABLE QUEUE

   LGR  R15,R14COPY COUNTER TO AJUST

   LA   R1,QUEUE_AVCOUNTERSCOUNTER IS LOCK WORD

   LG   R2,QUEUE_AVAILQGET CURRENT FREE ELEMENT HEAD

   STG  R2,ELEMENT_NEXTSET CURR HEAD AS NEXT

   ALG  R15,INCCHANGES INCR CHANGES BY 1 HIGH WORD

   AHI  R15,1  INCR ELEMENTS IN FREE BY 1

   LA   R0,10  PLO DOUBLE COMPARE AND SWAP

   PLO  R14,0(R1),R2,QUEUE_AVAILQ  ADD TO FREE ELEMENTS

   JNZ  ADDTOFREE

   JSEARCH_REDRIVE

 

**

*  COME HERE IF ELEMENT FOUND - PROCESS IS DONE

FOUND  DS 0H

*  R3 HAS FOUND ELEMENT

   ...

 

***

*  ERROR ROUTINE FOR OUT OF SPACE

WEREOUTOFSPACE DS 0H

   ...

 

 

HOPEFULLY, I DIDN'T TRANSLATE SOMETHING INCORRECTLY OR MISTYPED IT BUT

THIS IS A WORKING ALGORITHM THAT GETS CALLED THE FIRST TIME ANY DU

REQUESTS ANY SERVICE FROM THIS SERVER. IT'S USED BY TERMINATION

TO INSURE ALL RESOURCES ARE RETURNED WHEN A DU ENDS.

 


-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of esst...@juno.com
Sent: Saturday, November 09, 2013 1:48 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Serialization without Enque

I have been reading and following this thread sine PLO is not an instruction
I use every day.
It would be nice if someone would actually post some working code using a
PLO instruction, to illustrate how one would add an element to a queue and
remove an element from a queue.

Paul D'Angelo

 
--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread Peter Relson
applicable to 99%+ of all serialization scenarios you encounter 

To be frank, you might not have very complex serialization requirements. 
Also, using PLO when CS,CSG,CDS,CDSG would do is a significant waste of 
cycles. For the cases I have seen within our code, uses of PLO (in the 
cases where it is not better to use something simpler) are a tiny 
percentage of our serialization needs.

When the updating process wakes up S0C4! 

Using PLO to update a free queue, as is the case with CPOOL and its 
CDS-based free-queue protocol, requires that the queue elements *never* be 
freed (unless you like potentially blowing up or, worse, overlaying 
something you didn't intend to write into). Perhaps this is not well 
understood.

I really don't see the big deal with an 0c4 in this scenario (should 
happen rarely)

That's a scary statement. If you get an 0C4 you could probably deal with 
it. The real risk is the case where you don't get an 0C4 because the 
storage was re-allocated and used for something else and now it does not 
program check but overlays something.

I think I figured out a solution:

There are a lot of details missing in what was shown, but if you want my 
honest suspicion, it's that if this is a free queue type of protocol, it 
will not work.
The reason that the free queue protocol needs a sequence number is because 
even if the header matches, the values that you need to put into your 
new element for the next and/or previous may no longer be correct due 
to a set of additional asynchronous operations.

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread Kenneth Wilkerson
I really don't see the big deal with an 0c4 in this scenario (should
happen rarely)

You misunderstood my point. You could use PLO to serialize a chain even if
the areas are released as they are deleted provided you always use PLO
Compare and Load to load the pointers and recovery sets a retry to re-drive
the process. As long as the count is updated each time you do a delete (and
release), if the delete occurs while some other management function is being
performed, the PLO Compare and Load will force the redrive either by CC or
0c4. If the storage had been reallocated but still in the same key, the PLO
would fail because the count has changed. PLO may fetch operands before the
lock and memory serialization so an 0c4 can occur on the PLO Compare and
Load. Either way, the PLO does not store so there is never an overlay.

I would never design an application to use PLO in this fashion. However, if
I had an existing application, I find this methodology more desirable than
getting a CMS lock everywhere I access the chain. I stand by my statement
that you can serialize 99+% of all scenarios using PLO and that P:LO
serialization is much more desirable than locks.  If this were not the case,
why bother creating the transactional execution facility?

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Peter Relson
Sent: Friday, November 08, 2013 8:03 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

applicable to 99%+ of all serialization scenarios you encounter

To be frank, you might not have very complex serialization requirements. 
Also, using PLO when CS,CSG,CDS,CDSG would do is a significant waste of
cycles. For the cases I have seen within our code, uses of PLO (in the cases
where it is not better to use something simpler) are a tiny percentage of
our serialization needs.

When the updating process wakes up S0C4! 

Using PLO to update a free queue, as is the case with CPOOL and its
CDS-based free-queue protocol, requires that the queue elements *never* be
freed (unless you like potentially blowing up or, worse, overlaying
something you didn't intend to write into). Perhaps this is not well
understood.

I really don't see the big deal with an 0c4 in this scenario (should
happen rarely)

That's a scary statement. If you get an 0C4 you could probably deal with 
it. The real risk is the case where you don't get an 0C4 because the 
storage was re-allocated and used for something else and now it does not 
program check but overlays something.

I think I figured out a solution:

There are a lot of details missing in what was shown, but if you want my 
honest suspicion, it's that if this is a free queue type of protocol, it 
will not work.
The reason that the free queue protocol needs a sequence number is because 
even if the header matches, the values that you need to put into your 
new element for the next and/or previous may no longer be correct due 
to a set of additional asynchronous operations.

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread Kenneth Wilkerson
-Original Message-
From: Kenneth Wilkerson [mailto:redb...@austin.rr.com] 
Sent: Friday, November 08, 2013 8:46 AM
To: 'IBM Mainframe Discussion List'
Subject: RE: Serialization without Enque

I really don't see the big deal with an 0c4 in this scenario (should
happen rarely)

You misunderstood my point. You could use PLO to serialize a chain even if
the areas are released as they are deleted provided you always use PLO
Compare and Load to load the pointers and recovery sets a retry to re-drive
the process. As long as the count is updated each time you do a delete (and
release), if the delete occurs while some other management function is being
performed, the PLO Compare and Load will force the redrive either by CC or
0c4. If the storage had been reallocated but still in the same key, the PLO
would fail because the count has changed. PLO may fetch operands before the
lock and memory serialization so an 0c4 can occur on the PLO Compare and
Load. Either way, the PLO does not store so there is never an overlay.

I would never design an application to use PLO in this fashion. However, if
I had an existing application, I find this methodology more desirable than
getting a CMS lock everywhere I access the chain. I stand by my statement
that you can serialize 99+% of all scenarios using PLO and that P:LO
serialization is much more desirable than locks.  If this were not the case,
why bother creating the transactional execution facility?

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Peter Relson
Sent: Friday, November 08, 2013 8:03 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

applicable to 99%+ of all serialization scenarios you encounter

To be frank, you might not have very complex serialization requirements. 
Also, using PLO when CS,CSG,CDS,CDSG would do is a significant waste of
cycles. For the cases I have seen within our code, uses of PLO (in the cases
where it is not better to use something simpler) are a tiny percentage of
our serialization needs.

When the updating process wakes up S0C4! 

Using PLO to update a free queue, as is the case with CPOOL and its
CDS-based free-queue protocol, requires that the queue elements *never* be
freed (unless you like potentially blowing up or, worse, overlaying
something you didn't intend to write into). Perhaps this is not well
understood.

I really don't see the big deal with an 0c4 in this scenario (should
happen rarely)

That's a scary statement. If you get an 0C4 you could probably deal with it.
The real risk is the case where you don't get an 0C4 because the storage was
re-allocated and used for something else and now it does not program check
but overlays something.

I think I figured out a solution:

There are a lot of details missing in what was shown, but if you want my
honest suspicion, it's that if this is a free queue type of protocol, it
will not work.
The reason that the free queue protocol needs a sequence number is because
even if the header matches, the values that you need to put into your new
element for the next and/or previous may no longer be correct due to a
set of additional asynchronous operations.

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread Donald Likens
Thank You for your help (all of you) but Peter's statement below does not make 
sense to me (maybe because I don't understand something).

The reason that the free queue protocol needs a sequence number is because 
even if the header matches, the values that you need to put into your 
new element for the next and/or previous may no longer be correct due 
to a set of additional asynchronous operations.

I use PLO to add an element to a chain. The chain only has forward pointers. I 
always add to the end of the chain. I use storage (not CPOOL) to get the 
element. It turns out that I haven't gotten the PLO instruction to work 
properly yet but in theory in my scenario it seem to me that if the pointer to 
the last element is pointing to a element (Not 0) I should be able to store the 
next pointer and the last pointer in one PLO CSDST. Here is the actual (not 
working code... It is not updating the chain properly):

CSAMSEGL == Last element on the chain
CSAMSEGF == First element on the chain
R8 Address of element to add.
MSEGNEXT == Pointer to next element in last control block
MSEGCB == element name

DOX078   DS0H 
*C IF CSAMSEGL EQ 0 THEN(no elements on the chain)  

 XRR4,R4  
 XRR5,R5  
 LRR2,R8(R8: ADDRESS OF MSEGCB)   
 LRR3,R8  
*C   SET CSAMSEGL = CSAMSEGF = MSEGCB JUST BUILT  
 CDS   R4,R2,CSAMSEGF IF CSAMSEGF  CSAMSEGL = 0, STM 2,3,CSAMSEGF  
 BC4,ELIFX076 
*C ELSE   
 B EIFX076
ELIFX076 DS0H 
*C IF CSAMSEGL = CSAMSEGL (R2) 
*C   SET CSAMSEGL = POINTER_TO_NEW_MSEG (R8)   
*C   SET MSEGNEXT = POINTER_TO_NEW_MSEG (R8)   
CSDSTEQU   16  
 L  R2,CSAMSEGL  
 LAR0,CSDST 
 LAR1,PLT   
 
 LRR3,R8   
 LAR4,CSAMSEGL   CSAMSEGL IS IN CSA
 STR4,OPERAND6 
 LAR4,MSEGNEXT   MSEGNEXT IS IN CSA
 STR4,OPERAND4 
 STR8,OPERAND3 
 STR8,OPERAND5 
 PLO   R2,CSAMSEGL,0,PL
*  THE FIRST-OPERAND COMPARISON VALUE AND THE SECOND OPERAND ARE
*  COMPARED.  IF THEY ARE EQUAL, THE FIRST-OPERAND REPLACEMENT VALUE
*  IS STORED AT THE SECOND-OPERAND LOCATION, AND THE THIRD OPERAND IS   
*  STORED AT THE FOURTHOPERAND LOCATION. THEN, THE FIFTH OPERAND IS 
*  STORED AT THE SIXTH-OPERAND LOCATION.
 BNZ   DOX078   
EIFX076  DS0H
  
PLT  DSDPLO LOCK TOKEN 
PL   DS0F   PARAMETER LIST 
 ORG   PL+60   
OPERAND3 DSANEW MSEG ADDRESS   
 ORG   PL+76   
OPERAND4 DSAADDRESS OF CSAMSEGL
 ORG   PL+92   
OPERAND5 DSANEW MSEG ADDRESS   
 ORG   PL+108  
OPERAND6 DSAADDRESS OF MSEGNEXT  

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread Jon Perryman
This specific paragraph from Peter is about FREE QUEUE PROTOCOL. This is 
where elements on your chain are no longer needed. Peter recommends not freeing 
the element. Instead, you should use a queue of free elements that you reuse 
when you need a new element.

Peter's concern is not the chaining. It's with the data each element represents 
and making sure it remains consistent in a multi-processing environment. A 
program using the element is not guaranteed the element hasn't been freed and 
possibly re-allocated. PLO only ensures serialization of the chain.  To ensure 
the element is valid, header validation is not a enough. You should maintain a 
validation field that is occasionally verified. This greatly reduces the 
exposure but does not completely eliminate it. 

Peter's concern is valid and justified. S0C4 abends are visible so they can be 
dealt with. The real problem is storage overlays that are not immediately 
apparent. Even worse is when they affect unrelated programs.

Jon Perryman.
  




 From: Donald Likens dlik...@infosecinc.com


Thank You for your help (all of you) but Peter's statement below does not make 
sense to me (maybe because I don't understand something).

The reason that the free queue protocol needs a sequence number is because 
even if the header matches, the values that you need to put into your 
new element for the next and/or previous may no longer be correct due 
to a set of additional asynchronous operations.

I use PLO to add an element to a chain. The chain only has forward pointers. I 
always add to the end of the chain. I use storage (not CPOOL) to get the 
element. It turns out that I haven't gotten the PLO instruction to work 
properly yet but in theory in my scenario it seem to me that if the pointer to 
the last element is pointing to a element (Not 0) I should be able to store 
the next pointer and the last pointer in one PLO CSDST. Here is the actual 
(not working code... It is not updating the chain properly):

CSAMSEGL == Last element on the chain
CSAMSEGF == First element on the chain
R8 Address of element to add.
MSEGNEXT == Pointer to next element in last control block
MSEGCB == element name

DOX078   DS    0H                                                    
*C     IF CSAMSEGL EQ 0 THEN    (no elements on the chain)                     
                 
         XR    R4,R4                                                  
         XR    R5,R5                                                  
         LR    R2,R8    (R8: ADDRESS OF MSEGCB)                      
         LR    R3,R8                                                  
*C       SET CSAMSEGL = CSAMSEGF = MSEGCB JUST BUILT                  
         CDS   R4,R2,CSAMSEGF IF CSAMSEGF  CSAMSEGL = 0, STM 2,3,CSAMSEGF  
         BC    4,ELIFX076                                            
*C     ELSE                                                          
         B     EIFX076                                                
ELIFX076 DS    0H                                                    
*C IF CSAMSEGL = CSAMSEGL (R2)                        
*C   SET CSAMSEGL = POINTER_TO_NEW_MSEG (R8)          
*C   SET MSEGNEXT = POINTER_TO_NEW_MSEG (R8)          
CSDST    EQU   16                                      
         L      R2,CSAMSEGL  
         LA    R0,CSDST                                
         LA    R1,PLT                                                          
  
         LR    R3,R8                                  
         LA    R4,CSAMSEGL   CSAMSEGL IS IN CSA        
         ST    R4,OPERAND6                            
         LA    R4,MSEGNEXT   MSEGNEXT IS IN CSA        
         ST    R4,OPERAND4                            
         ST    R8,OPERAND3                            
         ST    R8,OPERAND5                            
         PLO   R2,CSAMSEGL,0,PL                        
*  THE FIRST-OPERAND COMPARISON VALUE AND THE SECOND OPERAND ARE        
*  COMPARED.  IF THEY ARE EQUAL, THE FIRST-OPERAND REPLACEMENT VALUE    
*  IS STORED AT THE SECOND-OPERAND LOCATION, AND THE THIRD OPERAND IS  
*  STORED AT THE FOURTHOPERAND LOCATION. THEN, THE FIFTH OPERAND IS    
*  STORED AT THE SIXTH-OPERAND LOCATION.                                
         BNZ   DOX078  
EIFX076  DS    0H
  
PLT      DS    D        PLO LOCK TOKEN        
PL       DS    0F       PARAMETER LIST        
         ORG   PL+60                          
OPERAND3 DS    A        NEW MSEG ADDRESS      
         ORG   PL+76                          
OPERAND4 DS    A        ADDRESS OF CSAMSEGL    
         ORG   PL+92                          
OPERAND5 DS    A        NEW MSEG ADDRESS      
         ORG   PL+108                          
OPERAND6 DS    A        ADDRESS OF MSEGNEXT                              

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to 

Re: Serialization without Enque

2013-11-08 Thread Kenneth Wilkerson
First, I'm not sure why you have chosen PLT as your lock word. It's very 
important the lock word resolve to the same REAL address no matter where the 
PLO executes. Since you are talking about multiple operations against the same 
chain, unless all the processes exist in the same shared program you can't be 
sure the same lock word is always used. The lock word contents are not altered 
just used to create a lock value(PLT) for serialization. From your code, I 
would chose CSAMSEGF.

 Second, you can't mix the CDS with PLO and expect consistent results. It would 
be very easy to convert the CDS to a PLO Compare Double and Swap (compare each 
full word and swap). Just be sure to use the same lock word which I still 
suggest as CSAMSEGF.

Third, you're going to have to use a count. And since you are acquiring and 
releasing this storage, you REALLY have to use a counter. Just add a full word 
counter in CSA. In this case, I would choose the counter as the lock word. As 
long as you have a full word counter and recovery to treat a 0c4 as a re-drive 
(see my prior comments), this should work. I don't know if you can, but the 
choice of another methodology where storage is not acquired for each add and 
released for each delete would be recommended. However, you can make this work 
but not without a counter. 

Last, Peter's comment is very valid. Read the notes in CDS in the POM. I don't 
know why PLO doesn't reference them since they are just as applicable to PLO.

 I have not closely examined your logic so I don't know what the logic problem 
is in the code. I'm just commenting on the methodology.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Donald Likens
Sent: Friday, November 08, 2013 10:22 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Thank You for your help (all of you) but Peter's statement below does not make 
sense to me (maybe because I don't understand something).

The reason that the free queue protocol needs a sequence number is because even 
if the header matches, the values that you need to put into your new element 
for the next and/or previous may no longer be correct due to a set of 
additional asynchronous operations.

I use PLO to add an element to a chain. The chain only has forward pointers. I 
always add to the end of the chain. I use storage (not CPOOL) to get the 
element. It turns out that I haven't gotten the PLO instruction to work 
properly yet but in theory in my scenario it seem to me that if the pointer to 
the last element is pointing to a element (Not 0) I should be able to store the 
next pointer and the last pointer in one PLO CSDST. Here is the actual (not 
working code... It is not updating the chain properly):

CSAMSEGL == Last element on the chain
CSAMSEGF == First element on the chain
R8 Address of element to add.
MSEGNEXT == Pointer to next element in last control block MSEGCB == element 
name

DOX078   DS0H 
*C IF CSAMSEGL EQ 0 THEN(no elements on the chain)  

 XRR4,R4  
 XRR5,R5  
 LRR2,R8(R8: ADDRESS OF MSEGCB)   
 LRR3,R8  
*C   SET CSAMSEGL = CSAMSEGF = MSEGCB JUST BUILT  
 CDS   R4,R2,CSAMSEGF IF CSAMSEGF  CSAMSEGL = 0, STM 2,3,CSAMSEGF  
 BC4,ELIFX076 
*C ELSE   
 B EIFX076
ELIFX076 DS0H 
*C IF CSAMSEGL = CSAMSEGL (R2) 
*C   SET CSAMSEGL = POINTER_TO_NEW_MSEG (R8)   
*C   SET MSEGNEXT = POINTER_TO_NEW_MSEG (R8)   
CSDSTEQU   16  
 L  R2,CSAMSEGL  
 LAR0,CSDST 
 LAR1,PLT   
 
 LRR3,R8   
 LAR4,CSAMSEGL   CSAMSEGL IS IN CSA
 STR4,OPERAND6 
 LAR4,MSEGNEXT   MSEGNEXT IS IN CSA
 STR4,OPERAND4 
 STR8,OPERAND3 
 STR8,OPERAND5 
 PLO   R2,CSAMSEGL,0,PL
*  THE FIRST-OPERAND COMPARISON VALUE AND THE SECOND OPERAND ARE
*  COMPARED.  IF THEY ARE EQUAL, THE FIRST-OPERAND REPLACEMENT VALUE
*  IS STORED AT THE SECOND-OPERAND LOCATION, AND THE THIRD OPERAND IS   
*  STORED AT THE FOURTHOPERAND LOCATION. THEN, THE FIFTH OPERAND

Re: Serialization without Enque

2013-11-08 Thread Kenneth Wilkerson
A storage overlay cannot occur in a properly implemented PLO with a counter
as long as the counter is properly maintained with every process
incrementing it by 1. Even in in a free chain implementation, an improper
PLO sequence can result in a circular or broken chain.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jon Perryman
Sent: Friday, November 08, 2013 11:44 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

This specific paragraph from Peter is about FREE QUEUE PROTOCOL. This is
where elements on your chain are no longer needed. Peter recommends not
freeing the element. Instead, you should use a queue of free elements that
you reuse when you need a new element.

Peter's concern is not the chaining. It's with the data each element
represents and making sure it remains consistent in a multi-processing
environment. A program using the element is not guaranteed the element
hasn't been freed and possibly re-allocated. PLO only ensures serialization
of the chain.  To ensure the element is valid, header validation is not a
enough. You should maintain a validation field that is occasionally
verified. This greatly reduces the exposure but does not completely
eliminate it. 

Peter's concern is valid and justified. S0C4 abends are visible so they can
be dealt with. The real problem is storage overlays that are not immediately
apparent. Even worse is when they affect unrelated programs.

Jon Perryman.
  




 From: Donald Likens dlik...@infosecinc.com


Thank You for your help (all of you) but Peter's statement below does not
make sense to me (maybe because I don't understand something).

The reason that the free queue protocol needs a sequence number is 
because even if the header matches, the values that you need to put 
into your new element for the next and/or previous may no longer be 
correct due to a set of additional asynchronous operations.

I use PLO to add an element to a chain. The chain only has forward
pointers. I always add to the end of the chain. I use storage (not CPOOL) to
get the element. It turns out that I haven't gotten the PLO instruction to
work properly yet but in theory in my scenario it seem to me that if the
pointer to the last element is pointing to a element (Not 0) I should be
able to store the next pointer and the last pointer in one PLO CSDST. Here
is the actual (not working code... It is not updating the chain properly):

CSAMSEGL == Last element on the chain
CSAMSEGF == First element on the chain
R8 Address of element to add.
MSEGNEXT == Pointer to next element in last control block MSEGCB == 
element name

DOX078   DS    0H *C     IF CSAMSEGL EQ 0 THEN    (no elements on the 
chain)
         XR    R4,R4
         XR    R5,R5
         LR    R2,R8    (R8: ADDRESS OF MSEGCB)
         LR    R3,R8 *C       SET CSAMSEGL = CSAMSEGF = MSEGCB JUST 
BUILT
         CDS   R4,R2,CSAMSEGF IF CSAMSEGF  CSAMSEGL = 0, STM 
2,3,CSAMSEGF
         BC    4,ELIFX076 *C     ELSE
         B     EIFX076
ELIFX076 DS    0H *C IF CSAMSEGL = CSAMSEGL (R2) *C   SET CSAMSEGL = 
POINTER_TO_NEW_MSEG (R8) *C   SET MSEGNEXT = POINTER_TO_NEW_MSEG (R8) 
CSDST    EQU   16
         L      R2,CSAMSEGL
         LA    R0,CSDST
         LA    R1,PLT
         LR    R3,R8
         LA    R4,CSAMSEGL   CSAMSEGL IS IN CSA
         ST    R4,OPERAND6
         LA    R4,MSEGNEXT   MSEGNEXT IS IN CSA
         ST    R4,OPERAND4
         ST    R8,OPERAND3
         ST    R8,OPERAND5
         PLO   R2,CSAMSEGL,0,PL
*  THE FIRST-OPERAND COMPARISON VALUE AND THE SECOND OPERAND ARE
*  COMPARED.  IF THEY ARE EQUAL, THE FIRST-OPERAND REPLACEMENT VALUE
*  IS STORED AT THE SECOND-OPERAND LOCATION, AND THE THIRD OPERAND IS
*  STORED AT THE FOURTHOPERAND LOCATION. THEN, THE FIFTH OPERAND IS
*  STORED AT THE SIXTH-OPERAND LOCATION.
         BNZ   DOX078
EIFX076  DS    0H
  
PLT      DS    D        PLO LOCK TOKEN PL       DS    0F       
PARAMETER LIST
         ORG   PL+60
OPERAND3 DS    A        NEW MSEG ADDRESS
         ORG   PL+76
OPERAND4 DS    A        ADDRESS OF CSAMSEGL
         ORG   PL+92
OPERAND5 DS    A        NEW MSEG ADDRESS
         ORG   PL+108
OPERAND6 DS    A        ADDRESS OF MSEGNEXT

--
For IBM-MAIN subscribe / signoff / archive access instructions, send 
email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN




--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread Jon Perryman
The storage overlay does not pertain to the PLO. It pertains to the entire 
element not being immediately removed from any type of use. Just because you 
removed the element from the chain does not mean it's not in use somewhere. You 
can't even say how long the element may be in use (e.g. task does not get any 
CPU because of CPU load or swapped out address space in multi-address space 
serialization).

Jon Perryman.





 From: Kenneth Wilkerson redb...@austin.rr.com



A storage overlay cannot occur in a properly implemented PLO with a counter
as long as the counter is properly maintained with every process
incrementing it by 1. Even in in a free chain implementation, an improper
PLO sequence can result in a circular or broken chain.

-Original Message-
Behalf Of Jon Perryman

This specific paragraph from Peter is about FREE QUEUE PROTOCOL.


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread John Gilmore
Jon Perryman wrote:

begin extract
This is where elements on your chain are no longer needed. Peter
recommends not freeing the element. Instead, you should use a queue of
free elements that you reuse when you need a new , , ,
/end extract

and I have two comments.

First, 'queue of free elements'  need not be taken literally.  These
elements can and should be managed as a pool, implemented as a LIFO
list not because LIFO sequencing is itself important but because it is
simpler than alternatives to it.

Second, such lists and their associated pools are better managed using
a primary element-allocation quantity p and a secondary
element-allocation quantity s, with s = p.

When processing is initialized p elements are added to the pool.
Thereafter an element is always removed from the pool when one is
needed and returned to the pool when it is no longer needed.

The secondary allocation s figures in processing in just one way.  If
the pool is found to be empty when an attempt is made to remove an
element from it s elements are added to the pool.

A further operation is usually appropriate.  Testing to ensure that
the current pool-element count c is less than s when an element is to
be returned to the pool, and freeing its storage instead iff c = s
ensures that surges of acrtivity do not result in the accumulation of
excessive numbers of elements in the pool.

This scheme has the merit that it is 0c4-proof.  No freed storage is
reused.  It is also simple, fast and tunable.


John Gilmore, Ashland, MA 01721 - USA

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread Kenneth Wilkerson
Yes, I was talking about all references using PLO. I was also assuming this
was a work queue where deletion from the chain was the methodology for
claiming ownership of the element. However, any serialization method that
performs deletion must also have a method for claiming ownership before an
element can be deleted. It doesn't matter whether deletion is an actual
release of storage or the placement of the element into a free chain. If any
process other than the owning process maintains a reference to an element
without claiming it, the problem exists whether you use locks, PLO, CS,
whatever. If the reference is nothing more than searching the chain, then
PLO Compare and Load can solve that. If the reference is more than that, the
problem is not storage overlays or 0c4s. The problem is a missing method for
ownership. If this is true for this application, the chain is only
serializable with a lock and the lock must be held throughout the period
where the element is referenced before the element can be safely deleted. 


-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jon Perryman
Sent: Friday, November 08, 2013 1:11 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

The storage overlay does not pertain to the PLO. It pertains to the entire
element not being immediately removed from any type of use. Just because you
removed the element from the chain does not mean it's not in use somewhere.
You can't even say how long the element may be in use (e.g. task does not
get any CPU because of CPU load or swapped out address space in
multi-address space serialization).

Jon Perryman.





 From: Kenneth Wilkerson redb...@austin.rr.com



A storage overlay cannot occur in a properly implemented PLO with a 
counter as long as the counter is properly maintained with every 
process incrementing it by 1. Even in in a free chain implementation, 
an improper PLO sequence can result in a circular or broken chain.

-Original Message-
Behalf Of Jon Perryman

This specific paragraph from Peter is about FREE QUEUE PROTOCOL.


--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread John Gilmore
Who frees what is always important.  In general a list manager should
have no notion and make no assumptions about the structure or
organization of the things it puts into and removes from the list.

I tell my students that they should think of these things as SOTs,
instances of Something Out There.  What they should deal in is a
pointer to a SOT [and sometimes a key too]  and a chaining block (CB)
specific to the list, which always contains 1) a pointer to the SOT
and 2) the additional chaining pointers required by the list
organization.

The SOT belongs to a 'user'.  Its storage is never freed.  The CBs
belong to the list, which allocates and frees their storage.

John Gilmore, Ashland, MA 01721 - USA

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread Jon Perryman
I'm guessing that Peter was warning against the situation where you have an 
element table rather than a queue. Queues are straight forward because each 
entry will only be in use by 1 program at any given time (ownership is always 
the task using it). Element tables on the other hand may have multiple tasks 
referencing the same entry at the same time. An example that is easily 
understood is dynamic LPA. These modules exist in common storage and you may 
dynamically replace them with a new version. Anyone still using the old version 
must not abend so IBM leaves all old versions in storage until the next IPL. 
Since you don't know if anyone is using the old version, you can't free / reuse 
it. In the old days when CSA was constrained, it was a real problem to use such 
commands. Today, we don't think twice about it but it still eats up some CSA.

As for handling a queue, I use  CS with the entries LIFO on the queue. A common 
task removes the queued entries and takes ownership.  At that time, it repairs 
the forward pointers and gets the first entry. I haven't heard anything here 
that would make me think that PLO would be less expensive so I wouldn't bother 
changing it's implementation

Jon Perryman.




 From: Kenneth Wilkerson redb...@austin.rr.com



Yes, I was talking about all references using PLO. I was also assuming this
was a work queue where deletion from the chain was the methodology for
claiming ownership of the element. 


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-08 Thread Peter Relson
Looks like my impossible with respect to the free queue protocol needed 
to be qualified by until PLO came along as long as you have storage to 
work with for the PLO parameters.

As was pointed out to me,
PLO CL  and PLO DCS can be used to implement the free queue protocol (and 
not need the sequence number)

I don't think that's the protocol that the OP was going after, however, 
but I've lost track.

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-07 Thread Peter Relson
I was wondering if there was
quad-word consistency as well?

I'm sure there is

I always insure that the primary counter
is in the first double word.

This is not necessary because of the quadword consistency. 

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-07 Thread Donald Likens
It has taken me this long to mostly understand PLO... I must be slow.

Now that I understand it (mostly) I am pretty sure it will not work for me. My 
problem is that a process comes in and removes the control block chain while 
another process is suspended and attempting to update the chain. When the 
updating process wakes up S0C4! That is why I was looking at using locks. If 
the process updating the chain holds a lock and the process removing the chain 
needs that lock to update the pointers this would not happen. So back to my 
original question:

My code must be able to run in SRB mode and with locks held. I have a situation 
where I need to serialize processing and cannot use CDS because the two 
addresses being updated cannot be next to each other (because I use CDS with 
these two addresses with other addresses). I have attempted to use a 
combination of CS instructions to resolve this problem but it does not work. I 
know this will work if I use a CMS lock but I am concerned about affecting the 
whole system. Any advice?

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-07 Thread Kenneth Wilkerson
If your application is not designed to use PLO for serialization, it'll 
definitely not work for you. I use PLO for serialization because of issues with 
locks that you are describing (system affects) and many others. All my code can 
run as SRBs but unlike what you describe I almost never acquire locks. I always 
use PLO except when interfacing with MVS services that require locks which I do 
rarely.  Besides you can't mix CS with PLO meaning you would have to convert 
every CS affecting that chain. Sounds to me like you're stuck with the CMS 
lock. And if it’s a process done frequently, it will have a system impact; a 
significantly greater impact than a PLO serialized one. 

The thing I don't understand is you're statement My problem is that a process 
comes in and removes the control block chain while another process is suspended 
and attempting to update the chain. When the updating process wakes up S0C4! 
If you mean that you're releasing the storage for the chain, the process doing 
the release could use a PLO to set the chain pointers to 0 and in that process 
update the swap word. The second process will then get a CC forcing re-drive 
and it'll discover the chain is  now 0. In that case, that would probably mean 
that you would need to use PLO Compare and loads for every reference to those 
chain pointers or (my preference) you would have a retry point detect the 0c4 
and realize the chain has been released and just continue . I really don't see 
the big deal with an 0c4 in this scenario (should happen rarely). The PLO 
process does not update anything until the PLO executes so no harm no foul. 
Again, if the application is not designed to use PLO, it won't work. 

And you're not slow. When I first started using PLO almost 10 years ago. I 
spent days writing test cases with special traces so I could see what was 
happening. The whole time I had the POM open to the PLO instruction reading it 
over and over. Now, I consistently code PLO without much thought. If you chose 
to spend time learning the PLO, I think you'll find it vastly superior to locks 
(no restrictions) and applicable to 99%+ of all serialization scenarios you 
encounter provided you design the application to use PLO from the start.

Kenneth 

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Donald Likens
Sent: Thursday, November 07, 2013 8:33 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

It has taken me this long to mostly understand PLO... I must be slow.

Now that I understand it (mostly) I am pretty sure it will not work for me. My 
problem is that a process comes in and removes the control block chain while 
another process is suspended and attempting to update the chain. When the 
updating process wakes up S0C4! That is why I was looking at using locks. If 
the process updating the chain holds a lock and the process removing the chain 
needs that lock to update the pointers this would not happen. So back to my 
original question:

My code must be able to run in SRB mode and with locks held. I have a situation 
where I need to serialize processing and cannot use CDS because the two 
addresses being updated cannot be next to each other (because I use CDS with 
these two addresses with other addresses). I have attempted to use a 
combination of CS instructions to resolve this problem but it does not work. I 
know this will work if I use a CMS lock but I am concerned about affecting the 
whole system. Any advice?   

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to 
lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-07 Thread Tony Harminc
On 7 November 2013 09:33, Donald Likens dlik...@infosecinc.com wrote:

 Now that I understand it (mostly) I am pretty sure it will not work for me. 
 My problem is that a process comes in and removes the control block chain 
 while another process is suspended and attempting to update the chain. When 
 the updating process wakes up S0C4! That is why I was looking at using locks. 
 If the process updating the chain holds a lock and the process removing the 
 chain needs that lock to update the pointers this would not happen.

Why don't you show us your control block layout, and explain how many
work units can add items, and how, how many can read, etc. There are
few of these things that truly require use of locks/ENQs and the like.

Tony H.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-07 Thread Donald Likens
I think I figured out a solution:

DOX078   DS0H
*C IF LASTCB EQ 0 THEN 
*C   SET LASTCB= FIRSTCB  = MSEGCB JUST BUILT 
 CDS   R4,R2,FIRSTCB  IF MSEGF  MSEGL = 0, STM 2,3,FIRSTCB  
*C ELSE  
*C   SET FIRSTCB  = POINTER_TO_NEW_MSEG IF FIRSTCB  = ZERO   
 CSR4,R8,FIRSTCB  IF FIRSTCB  = 0, ST R8 FIRSTCB 
*C IF LASTCB= LASTCB (R2)   
*C   SET LASTCB= POINTER_TO_NEW_MSEG (R1) 
*C   SET NEXTCB = POINTER_TO_NEW_MSEG (R1) 
 LAR0,CSDST  
 PLO   R2,CSAMSEGL,0,PL  
 BNZ   DOX078   


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-07 Thread Jon Perryman
If I understand your situation, then you are maintaining a queue. Today, you 
are probably queuing in the reverse order using CS (LIFO). You want to 
eliminate reversing the queue (FIFO) by using PLO to create the queue FIFO.

PLO is desgned to solve this problem. PLO doesn't care about the store fields 
and values. It only serializes the counter that has been described. In your 
case, the serialization is to represent any change to the queue regardless of 
the location on the queue.  

Fields: counter, address of first queued CB and address of last queued CB.

To remove queued entries:
1. Load and increment counter. Must be first to ensure correct to ensure step 2 
is the correct CB address..
2. Save first queued CB address.
3. PLO CSDST counter and store 0 at first  last CB address
4. Repeat if PLO failed.

To add to the queue
1. Load and increment counter
2. Copy last CB address to the backwards pointer of the new CB.
3. PLO CSDST counter and store new CB address in the forward pointer of the 
last CB  and store the new CB address in the last CB address.
4. Repeat if PLO fails.

You could serialize updates to any entry in the queue, just the same as the add 
but the second step would be to locate the entry to be updated. If you  If you 
are deleting entries, you might need to use PLO compare and load  when scanning 
thru the chain.

Is it more efficient to continue using CS / reverse chain or to use PLO to 
build the chain in the correct sequence?

Jon Perryman.



 From: Tony Harminc t...@harminc.net


 Now that I understand it (mostly) I am pretty sure it will not work for me. 
 My problem is that a process comes in and removes the control block chain 
 while another process is suspended and attempting to update the chain. When 
 the updating process wakes up S0C4! That is why I was looking at using 
 locks. If the process updating the chain holds a lock and the process 
 removing the chain needs that lock to update the pointers this would not 
 happen.


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-07 Thread Shmuel Metz (Seymour J.)
In
ofe01adb17.e9264150-on85257c1b.0044d053-85257c1b.00455...@us.ibm.com,
on 11/06/2013
   at 07:37 AM, Peter Relson rel...@us.ibm.com said:

One of the shortcomings of PLO (unlike TBEGIN(C) ) is that PLO in
general  serializes only against other uses of PLO.

I'd hardly label that as a shortcoming of PLO. CS only serializes
agains CS, ENQ only serializes against ENQ, etc.; it's the nature of
the beast.
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-07 Thread Tony Harminc
On 6 November 2013 16:30, Shmuel Metz (Seymour J.)
shmuel+ibm-m...@patriot.net wrote:
Peter Relson rel...@us.ibm.com said:

One of the shortcomings of PLO (unlike TBEGIN(C) ) is that PLO in
general  serializes only against other uses of PLO.

 I'd hardly label that as a shortcoming of PLO.

 CS only serializes agains CS,

It serializes happily against all the CS variations, TS, and the newer
interlocked-update instructions like ASI, LAA, and so on. And there
are cases where a simple ST or the like can interoperate usefully with
CS. For instance, if you update a counter with CS, it is safe to zero
it with ST.

Tony H.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-07 Thread Anne Lynn Wheeler
t...@harminc.net (Tony Harminc) writes:
 It serializes happily against all the CS variations, TS, and the newer
 interlocked-update instructions like ASI, LAA, and so on. And there
 are cases where a simple ST or the like can interoperate usefully with
 CS. For instance, if you update a counter with CS, it is safe to zero
 it with ST.

compare-and-swap is atomic instruction ... it does the compare and does
the store if the compare matches ... it solves the problem of interrupts
and other processes doing something while interrupted. normal process is
if the compare doesn't match ... and the store isn't done ... loop back
and restart the operation.

charlie had invented compare-and-swap while doing fine-grain
multiprocessing locking for cp67 at the cambridge science center
past posts mentioning science center
http://www.garlic.com/~lynn/subtopic.html#545tech

... compare-and-swap was chosen for the name of the instruction because
CAS are charlie's initials. past posts mentionin smp and/or
compare-and-swap
http://www.garlic.com/~lynn/subtopic.html#smp

the initial attempt to include compare-and-swap in 370 was rejected
because the pok favorite son operating system people said it wasn't
needed, that TS was more than sufficient for multiprocessor locking
(single kernel spin-lock). the 370 architecture owners said that to get
compare-and-swap justified for 370 (over pok objections), had to come up
with purposes other than multiprocessor locking. Thus was born the
examples (still in principles of operation) for interrupt-enabled,
multi-threaded applications (like large dbms) ... whether or not running
in multiprocessor configuration or not.

-- 
virtualization experience starting Jan1968, online at home since Mar1970

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-06 Thread Peter Relson
One of the shortcomings of PLO (unlike TBEGIN(C) ) is that PLO in general 
serializes only against other uses of PLO. It does not serialize against 
CS on the same storage, for example.

However, cache considerations and doubleword consistency still come into 
play.

A LM of 2 words of a doubleword is done with what is referred to as 
doubleword consistency. That matters. 
If you need to load two consecutive words and you can arrange that those 
two words are in the same doubleword, it can be to your advantage.
It's why in a doubleword serialized by CDS you do not typically (if ever) 
want to load both of the individual words consecutively (as you might get 
results that come half from one CDS and half from another CDS); you want 
to use LM.

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-06 Thread Kenneth Wilkerson
Thank you for mentioning the issue with CS/CDS. I have always understood
that if you use PLO anywhere  to serialize access to an area,  you must use
it everywhere to serialize access to that area.  It's nice to know that the
transactional facility serializes it against CS as well. I wish I had access
to a processor to play with it.

I always understood that the reason PLO did a memory serialization at the
end of a PLO was to insure that any processors that were referencing the
swap word(s) in a compare and swap would get consistent results; either the
swap value(32, 64 or 128) before the swap or after the  swap. Since the swap
value is always stored last and provided you always loaded the swap value
first, you would either get the correct swap and store values and the
subsequent PLO would succeed or you would get an outdated swap value with
inconsistent store values and the subsequent PLO would return a CC to
re-drive. I have written software traces for PLO and my observations have
supported this understanding of the POM's description of PLO and memory
serialization. Is this your understanding?

I often use 128 bit operations with consecutive words and double words to
perform some sophisticated operations. I've understood about double word
consistency. I've always assumed that PLO 128 bit required quad-word
alignment for 128 bit operations for the same reason. I always load the swap
value (in any flavor compare and swap and store) first. I always use a LMG
(even if its consecutive words) and I always insure that the primary counter
is in the first double word. I've read the POM many times looking for any
references to quad-word consistency. It doesn't really matter because of the
way I design PLO compare and swaps but I was wondering if there was
quad-word consistency as well?

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Peter Relson
Sent: Wednesday, November 06, 2013 6:37 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

One of the shortcomings of PLO (unlike TBEGIN(C) ) is that PLO in general
serializes only against other uses of PLO. It does not serialize against CS
on the same storage, for example.

However, cache considerations and doubleword consistency still come into
play.

A LM of 2 words of a doubleword is done with what is referred to as
doubleword consistency. That matters. 
If you need to load two consecutive words and you can arrange that those two
words are in the same doubleword, it can be to your advantage.
It's why in a doubleword serialized by CDS you do not typically (if ever)
want to load both of the individual words consecutively (as you might get
results that come half from one CDS and half from another CDS); you want to
use LM.

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-05 Thread Jon Perryman
Sorry for the confusion but that's not the question that I was asking. I agree 
with you on guaranteeing the consistency using the count.

I'm talking about TCB1 using PLO CSDST to store 2 adjacent words (4th  6th PLO 
operands) and TCB2 using LM or LG for those same 2 words. There is a very small 
window between the 2 store's where TCB2 will pick up inconsistent inconsistent 
values. In other words, the first store has completed and the LM/LG occurs 
before the second store completes. This window is extremely small because PLO 
cannot be interrupted and the instruction was prepared before performing the 
stores. 

I think the window is so small that even under heavy usage, you would only see 
an error every couple of months but it does exist. I think TCB2 must also use 
the PLO compare and load to avoid this situation.

Thanks for the great information, Jon Perryman.

From: Kenneth Wilkerson redb...@austin.rr.com


The order of stores is unpredictable except that  according to the POM,
operand 2 (in this case, the count) is always stored last. 


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-05 Thread Kenneth Wilkerson
This is a complicated question that is very dependent on the design of your
application. So if the design needs to use a PLO Compare and Load, by all
means do so. I try to design these applications to avoid as many PLO Compare
and Loads as possible. The memory serialization (once at the start and once
at the end) are expensive but much less so than software locks. This means
that I design the update operations so that loading the pointers during a
PLO operation will; at worse, simply result in outdated information. As I
explained in my first example, this is possible regardless of whether you
use PLO or a lock. It's just a consequence of concurrent activity and the
order in which these activities occur. 

In reality, I don't normally use PLO Compare and Swap and store (any flavor)
for chains. They work fine for singly linked lists.  Usually I use it (just
wrote something today) to dynamically add an entry to a pre-allocated slot
in a cell. In this case, the cell has pre-allocated slots with a high water
mark pointer and a count, consecutive words in a double word. In this case,
the high water mark and count are the 2nd operand and the lock word. Since
the  2nd operand always updates last, any references would not even know of
the new addition until the PLO completes. 

I fall back to my original provision. The use of PLO is heavily dependent on
designing the application to use PLO. Just yesterday, I debugged a problem
in a task that was not doing a PLO Compare and Load to get a count.
Certainly, all references to the counts in a Compare and Swap Store should
all be updated with a PLO and probably require a PLO Compare and Load. 


-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jon Perryman
Sent: Tuesday, November 05, 2013 10:50 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Sorry for the confusion but that's not the question that I was asking. I
agree with you on guaranteeing the consistency using the count.

I'm talking about TCB1 using PLO CSDST to store 2 adjacent words (4th  6th
PLO operands) and TCB2 using LM or LG for those same 2 words. There is a
very small window between the 2 store's where TCB2 will pick up inconsistent
inconsistent values. In other words, the first store has completed and the
LM/LG occurs before the second store completes. This window is extremely
small because PLO cannot be interrupted and the instruction was prepared
before performing the stores. 

I think the window is so small that even under heavy usage, you would only
see an error every couple of months but it does exist. I think TCB2 must
also use the PLO compare and load to avoid this situation.

Thanks for the great information, Jon Perryman.

From: Kenneth Wilkerson redb...@austin.rr.com


The order of stores is unpredictable except that  according to the POM, 
operand 2 (in this case, the count) is always stored last.


--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Rob Scott
PLO CSDST and CSTST are *extremely* useful for queue and linked list 
manipulation in multi-ASID multi-TCB environments. The key to their use is to 
have a lock word counter that the caller increments and then prepares the new 
values in other regs. When it comes time to actually atomically update the lock 
word, you can redrive the structure manipulation logic if the CC indicates that 
the lock word value has changed, otherwise the other fields are updated 
atomically.

For actual practical uses, it is well worth putting all this inside some sort 
of macro or small stub service as you do not want to have to code the guts of 
it each time. 

I also think the uptake of PLO would be greater if there were some decent 
example code in the manuals - for instance a client adding a request to the 
tail of the queue whilst a server is removing from the head.


Rob Scott
Rocket Software
 On 4 Nov 2013, at 03:58, Jon Perryman jperr...@pacbell.net wrote:
 
 I sure missed that one with the locks. 
 
 PLO CDS does exactly what is wanted.  It does 2 CS's within the locked 
 instruction. 
 
 PLO CSDST on the other hand only does a single CS followed by 2 ST's. Since 3 
 separate load instructions (not under PLO control) are required when not in 
 contiguous storage, there is not any method that will guarantee the 3 values 
 are consistent with the others. A counter as suggested by Peter Relson won't 
 help either for this same reason.
 
 
 I can't think of a situation where PLO CSDST is useful. Can anyone describe a 
 situation where it is useful?
 
 Jon Perryman.
 
 
 
 
 From: Rob Scott rsc...@rocketsoftware.com
 
 
 
 I think the OP stated that his code could hold locks - in which case the 
 latch services cannot be used.
 
 --
 For IBM-MAIN subscribe / signoff / archive access instructions,
 send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread David Crayford

On 4/11/2013 4:49 PM, Rob Scott wrote:

I also think the uptake of PLO would be greater if there were some decent 
example code in the manuals - for instance a client adding a request to the 
tail of the queue whilst a server is removing from the head.


Maybe somebody with expertise should blog about it with example code.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Binyamin Dissen
If you truly need a triple compare and swap then PLO will not help you. But if
you need a disjoint double compare and swap, you use the compare-and-swap
field as a counter and then you con do a compare swap and double store.

Example:

 Fetch counter
A   PLO  compare-and-fetch value1
 CC0, go to A
 PLO  compare-and-fetch value 2
 CC0, go to A
 calculate new value1 and 2
 Add one to fetched counter
 PLO CSDST fetched-counter new-fetched-counter, new value1, new-value2
 CC0, go to A

This requires that all processes that update value1 or value2 use PLO with the
same lock word.

On Sun, 3 Nov 2013 19:58:12 -0800 Jon Perryman jperr...@pacbell.net wrote:

:I sure missed that one with the locks. 
:
:PLO CDS does exactly what is wanted.  It does 2 CS's within the locked 
instruction. 
:
:PLO CSDST on the other hand only does a single CS followed by 2 ST's. Since 3 
separate load instructions (not under PLO control) are required when not in 
contiguous storage, there is not any method that will guarantee the 3 values 
are consistent with the others. A counter as suggested by Peter Relson won't 
help either for this same reason.
:
:
:I can't think of a situation where PLO CSDST is useful. Can anyone describe a 
situation where it is useful?
:
:Jon Perryman.
:
:
:
:
: From: Rob Scott rsc...@rocketsoftware.com
:
:
:
:I think the OP stated that his code could hold locks - in which case the 
latch services cannot be used.
:
:
:--
:For IBM-MAIN subscribe / signoff / archive access instructions,
:send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
Binyamin Dissen bdis...@dissensoftware.com
http://www.dissensoftware.com

Director, Dissen Software, Bar  Grill - Israel


Should you use the mailblocks package and expect a response from me,
you should preauthorize the dissensoftware.com domain.

I very rarely bother responding to challenge/response systems,
especially those from irresponsible companies.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Peter Relson
PLO CDS does exactly what is wanted.

I presume this was a typo and should have been PLO DCS (double compare and 
swap)

PLO CSDST (and CSST and CSTST) have limitless potential exploitations. It 
all depends on what your requirements are.
Suppose you have N (2) separate fields all of which need to be serialized 
together and the happy circumstance is that you never need to update more 
than 2 of them together. A counter (subject of the CS) and store(s) may be 
used to do the updates.

Since 3 separate load instructions

I wonder if a double compare and swap can guarantee the consistency of 3 
load instructions. To the extent that a free queue protocol is 
analogous, you have to use a sequence number to get that right.

(not under PLO control)

Why is that a requirement? It is far from unheard of to use PLO 
Compare-and-load

Peter Relson
z/OS Core Technology Design

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Kenneth Wilkerson
I have used PLO almost exclusively for serialization in multi-address space,
multi-du code for almost 10 years. I use all 6 operations. Since everything
I write is 64 bit mode, I generally use the +2 variant (64 bit length) but I
like using the +3 variant (128 bit length) for some really cool stuff. As
Rob pointed out,  The key to their use is to have a lock word counter that
the caller increments and then prepares the new values in other regs.. But
I add that the real key is designing the application to use PLO for
serialization which is much more than just writing macros to do the guts of
the processes (though I almost always use macros).

Consider this example. I have a singly linked chain of 64 bit cell addresses
on quad word boundaries. The chain pointers are a  quad word at the start of
the cell. The first double word is the head of the active chain. The second
double word is the head of the free chain. The first quad word of each cell
is a double word pointer to the next active entry followed by a double word
pointer to the next free entry. The chain has a quad word counter on a quad
word boundary. The first double word of the counter is a change count. The
second double word is an active element count. The algorithm always adds new
cells to the head of the list. 

I can add  a new cell by using a LMG to load the counters, increment each
counter by 1, compute the new head, compute the old head's new previous and
then use a Compare and swap and double store 128 bit to add the new entry.
Since every update increments the first double word counter by 1, the
process only completes if no other process updated the counter. If the
counter has changed, it needs to re-drive. By adding entries to the head, I
can also have code simultaneously searching the chain while entries are
added. Of course, if the new head is added before the search starts, it
won't be found. But that's no different than using a lock. If the search
acquires the lock before the add, it won't be found either.

I can even add an element that requires a search for one that has already
been added. In this case, I load the counters before the search. I search
the chain. If not found, I increment the counter and perform the add. If the
add fails, I have to re-drive the search.

I can also delete entries from the chain. When I find the entry to be
deleted, I save its previous entry. I can adjust the counts,  re-compute the
chain pointers and do a Compare and Swap and triple store to delete the
entry and add it to the fee chain. I can still search the chain but I'll
probably need to do a Compare and load to do so. I can avoid the PLO compare
and load by not actually deleting the cell but using the low half byte of
the active next pointer as a deleted flag. But that has disadvantages as
well. This also adds a little more logic to the add, since I now need to add
using the free chain if one exits or an add acquiring a new cell. 

There are a lot of details not given here for brevity. This example also
uses an unordered single linked list to simplify the example. But properly
designed PLO operations can be performed on ordered doubly linked lists as
well.  When I read the Principles of Operations on the Z/EC12 transactional
execution facility, I think strongly of a PLO on steroids. The point is that
PLO can almost be used exclusively for serialization. 

As far as overhead, I have done a lot of testing and the key is the proper
choice of the lock word and the algorithm. In my research, the throughput
advantages of PLO far outweigh its overhead. I would love some time with the
transactional execution facility. From my reading, it eliminates the need
for any serialization other than PLO or transactional execution. Though I
understand that IBM has chosen a redrive limit as the determining factor as
to whether to fall back to a lock. 

I believe the only limit to using PLO for serialization is the imagination.


Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Rob Scott
Sent: Monday, November 04, 2013 2:49 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

PLO CSDST and CSTST are *extremely* useful for queue and linked list
manipulation in multi-ASID multi-TCB environments. The key to their use is
to have a lock word counter that the caller increments and then prepares the
new values in other regs. When it comes time to actually atomically update
the lock word, you can redrive the structure manipulation logic if the CC
indicates that the lock word value has changed, otherwise the other fields
are updated atomically.

For actual practical uses, it is well worth putting all this inside some
sort of macro or small stub service as you do not want to have to code the
guts of it each time. 

I also think the uptake of PLO would be greater if there were some decent
example code in the manuals - for instance a client adding a request to the
tail of the queue whilst

Re: Serialization without Enque

2013-11-04 Thread Jon Perryman
Thanks Kenneth. Excellent example. I didn't consider that the load for the 
counter must be first to cause the serialization.

Jon Perryman.




 From: Kenneth Wilkerson redb...@austin.rr.com



I have used PLO almost exclusively for serialization in multi-address space,
multi-du code for almost 10 years. I use all 6 operations. Since everything
I write is 64 bit mode, I generally use the +2 variant (64 bit length) but I
like using the +3 variant (128 bit length) for some really cool stuff. As
Rob pointed out,  The key to their use is to have a lock word counter that
the caller increments and then prepares the new values in other regs.. But
I add that the real key is designing the application to use PLO for
serialization which is much more than just writing macros to do the guts of
the processes (though I almost always use macros).

Consider this example. I have a singly linked chain of 64 bit cell addresses
on quad word boundaries. The chain pointers are a  quad word at the start of
the cell. The first double word is the head of the active chain. The second
double word is the head of the free chain. The first quad word of each cell
is a double word pointer to the next active entry followed by a double word
pointer to the next free entry. The chain has a quad word counter on a quad
word boundary. The first double word of the counter is a change count. The
second double word is an active element count. The algorithm always adds new
cells to the head of the list. 

I can add  a new cell by using a LMG to load the counters, increment each
counter by 1, compute the new head, compute the old head's new previous and
then use a Compare and swap and double store 128 bit to add the new entry.
Since every update increments the first double word counter by 1, the
process only completes if no other process updated the counter. If the
counter has changed, it needs to re-drive. By adding entries to the head, I
can also have code simultaneously searching the chain while entries are
added. Of course, if the new head is added before the search starts, it
won't be found. But that's no different than using a lock. If the search
acquires the lock before the add, it won't be found either.

I can even add an element that requires a search for one that has already
been added. In this case, I load the counters before the search. I search
the chain. If not found, I increment the counter and perform the add. If the
add fails, I have to re-drive the search.

I can also delete entries from the chain. When I find the entry to be
deleted, I save its previous entry. I can adjust the counts,  re-compute the
chain pointers and do a Compare and Swap and triple store to delete the
entry and add it to the fee chain. I can still search the chain but I'll
probably need to do a Compare and load to do so. I can avoid the PLO compare
and load by not actually deleting the cell but using the low half byte of
the active next pointer as a deleted flag. But that has disadvantages as
well. This also adds a little more logic to the add, since I now need to add
using the free chain if one exits or an add acquiring a new cell. 

There are a lot of details not given here for brevity. This example also
uses an unordered single linked list to simplify the example. But properly
designed PLO operations can be performed on ordered doubly linked lists as
well.  When I read the Principles of Operations on the Z/EC12 transactional
execution facility, I think strongly of a PLO on steroids. The point is that
PLO can almost be used exclusively for serialization. 

As far as overhead, I have done a lot of testing and the key is the proper
choice of the lock word and the algorithm. In my research, the throughput
advantages of PLO far outweigh its overhead. I would love some time with the
transactional execution facility. From my reading, it eliminates the need
for any serialization other than PLO or transactional execution. Though I
understand that IBM has chosen a redrive limit as the determining factor as
to whether to fall back to a lock. 

I believe the only limit to using PLO for serialization is the imagination.


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Jon Perryman
Thanks Binyamin. Also a great example but it brings me to another question. 
What is the advantage of using PLO compare and fetch? Is it just saving CPU 
time in the case where the counter has changed? Is there another advantage that 
I'm not thinking about?

Jon Perryman.




 From: Binyamin Dissen bdis...@dissensoftware.com



If you truly need a triple compare and swap then PLO will not help you. But if
you need a disjoint double compare and swap, you use the compare-and-swap
field as a counter and then you con do a compare swap and double store.

Example:

     Fetch counter
A   PLO  compare-and-fetch value1
     CC0, go to A
     PLO  compare-and-fetch value 2
     CC0, go to A
     calculate new value1 and 2
     Add one to fetched counter
     PLO CSDST fetched-counter new-fetched-counter, new value1, new-value2
     CC0, go to A




--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Kenneth Wilkerson
Yes, it is possible that the updates are not performed in any order.
However, it is guaranteed that the updates are only performed if the swap
can be done. Therefore, I use a simple rule. If the number of instructions
needed to compute the new chain pointers are small (as is the case in my
example). I don't incur the overhead of doing the extra 2 PLO (Compare and
Load) operations. I simply re-drive the operation as shown in Binyamin's
example. Even with the PLO Compare and Load, there is no guarantee the swap
will succeed. It just lessens the likelihood. So the decision point is
whether the overhead of 2 additional PLO instructions is less than the
overhead of a re-drive. This can only be determined with testing. You can
determine this by using a CS  to update a counter for every re-drive. You
already have an operation count, so you can then easily determine the
percentage of re-drives. In my experience, even in very active chains, the
PLO serialization process will incur a very small number of re-drives (much
less than 1 percent).  But only testing can reveal that. 

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Binyamin Dissen
Sent: Monday, November 04, 2013 11:15 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

My understanding is with multi-threading it is possible that the updates to
the fields may be out of order and thus it is possible to fetch the updated
counter with the unupdated value1. PLO serializes it.

On Mon, 4 Nov 2013 07:46:51 -0800 Jon Perryman jperr...@pacbell.net wrote:

:Thanks Binyamin. Also a great example but it brings me to another
question. What is the advantage of using PLO compare and fetch? Is it just
saving CPU time in the case where the counter has changed? Is there another
advantage that I'm not thinking about?
:
:Jon Perryman.
:
:
:
:
: From: Binyamin Dissen bdis...@dissensoftware.com : : : :If you
truly need a triple compare and swap then PLO will not help you. But if
:you need a disjoint double compare and swap, you use the compare-and-swap
:field as a counter and then you con do a compare swap and double store.
:
:Example:
:
:     Fetch counter
:A   PLO  compare-and-fetch value1
:     CC0, go to A
:     PLO  compare-and-fetch value 2
:     CC0, go to A
:     calculate new value1 and 2
:     Add one to fetched counter
:     PLO CSDST fetched-counter new-fetched-counter, new value1,
new-value2 :     CC0, go to A : : : :
:--
:For IBM-MAIN subscribe / signoff / archive access instructions, :send
email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
Binyamin Dissen bdis...@dissensoftware.com http://www.dissensoftware.com

Director, Dissen Software, Bar  Grill - Israel


Should you use the mailblocks package and expect a response from me, you
should preauthorize the dissensoftware.com domain.

I very rarely bother responding to challenge/response systems, especially
those from irresponsible companies.

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Binyamin Dissen
That won't help if you fetch the new count and the old value1.

On Mon, 4 Nov 2013 11:38:38 -0600 Kenneth Wilkerson redb...@austin.rr.com
wrote:

:Yes, it is possible that the updates are not performed in any order.
:However, it is guaranteed that the updates are only performed if the swap
:can be done. Therefore, I use a simple rule. If the number of instructions
:needed to compute the new chain pointers are small (as is the case in my
:example). I don't incur the overhead of doing the extra 2 PLO (Compare and
:Load) operations. I simply re-drive the operation as shown in Binyamin's
:example. Even with the PLO Compare and Load, there is no guarantee the swap
:will succeed. It just lessens the likelihood. So the decision point is
:whether the overhead of 2 additional PLO instructions is less than the
:overhead of a re-drive. This can only be determined with testing. You can
:determine this by using a CS  to update a counter for every re-drive. You
:already have an operation count, so you can then easily determine the
:percentage of re-drives. In my experience, even in very active chains, the
:PLO serialization process will incur a very small number of re-drives (much
:less than 1 percent).  But only testing can reveal that. 
:
:-Original Message-
:From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
:Behalf Of Binyamin Dissen
:Sent: Monday, November 04, 2013 11:15 AM
:To: IBM-MAIN@LISTSERV.UA.EDU
:Subject: Re: Serialization without Enque
:
:My understanding is with multi-threading it is possible that the updates to
:the fields may be out of order and thus it is possible to fetch the updated
:counter with the unupdated value1. PLO serializes it.
:
:On Mon, 4 Nov 2013 07:46:51 -0800 Jon Perryman jperr...@pacbell.net wrote:
:
::Thanks Binyamin. Also a great example but it brings me to another
:question. What is the advantage of using PLO compare and fetch? Is it just
:saving CPU time in the case where the counter has changed? Is there another
:advantage that I'm not thinking about?
::
::Jon Perryman.
::
::
::
::
:: From: Binyamin Dissen bdis...@dissensoftware.com : : : :If you
:truly need a triple compare and swap then PLO will not help you. But if
::you need a disjoint double compare and swap, you use the compare-and-swap
::field as a counter and then you con do a compare swap and double store.
::
::Example:
::
::     Fetch counter
::A   PLO  compare-and-fetch value1
::     CC0, go to A
::     PLO  compare-and-fetch value 2
::     CC0, go to A
::     calculate new value1 and 2
::     Add one to fetched counter
::     PLO CSDST fetched-counter new-fetched-counter, new value1,
:new-value2 :     CC0, go to A : : : :
::--
::For IBM-MAIN subscribe / signoff / archive access instructions, :send
:email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
Binyamin Dissen bdis...@dissensoftware.com
http://www.dissensoftware.com

Director, Dissen Software, Bar  Grill - Israel


Should you use the mailblocks package and expect a response from me,
you should preauthorize the dissensoftware.com domain.

I very rarely bother responding to challenge/response systems,
especially those from irresponsible companies.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Kenneth Wilkerson
I'm glad you brought that up because I knew what I have been doing for years
was correct but I hadn't taken the time to read the manual on PLO in some
time. The order of stores is unpredictable except that  according to the
POM, operand 2 (in this case, the count) is always stored last. 

 In those cases when a store is performed to the second-
operand location and one or more of the fourth-,
sixth-, and eighth-operand locations, the store to the
second-operand location is always performed last, as
observed by other CPUs and by channel programs. 

Page 7-290 right column top half of page in SA22-7832-09, 7-281 in
SA22-7832-08

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Binyamin Dissen
Sent: Monday, November 04, 2013 2:02 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

That won't help if you fetch the new count and the old value1.

On Mon, 4 Nov 2013 11:38:38 -0600 Kenneth Wilkerson redb...@austin.rr.com
wrote:

:Yes, it is possible that the updates are not performed in any order.
:However, it is guaranteed that the updates are only performed if the swap
:can be done. Therefore, I use a simple rule. If the number of instructions
:needed to compute the new chain pointers are small (as is the case in my
:example). I don't incur the overhead of doing the extra 2 PLO (Compare and
:Load) operations. I simply re-drive the operation as shown in Binyamin's
:example. Even with the PLO Compare and Load, there is no guarantee the
swap :will succeed. It just lessens the likelihood. So the decision point
is :whether the overhead of 2 additional PLO instructions is less than the
:overhead of a re-drive. This can only be determined with testing. You can
:determine this by using a CS  to update a counter for every re-drive. You
:already have an operation count, so you can then easily determine the
:percentage of re-drives. In my experience, even in very active chains, the
:PLO serialization process will incur a very small number of re-drives
(much :less than 1 percent).  But only testing can reveal that. 
:
:-Original Message-
:From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
:Behalf Of Binyamin Dissen
:Sent: Monday, November 04, 2013 11:15 AM
:To: IBM-MAIN@LISTSERV.UA.EDU
:Subject: Re: Serialization without Enque : :My understanding is with
multi-threading it is possible that the updates to :the fields may be out
of order and thus it is possible to fetch the updated :counter with the
unupdated value1. PLO serializes it.
:
:On Mon, 4 Nov 2013 07:46:51 -0800 Jon Perryman jperr...@pacbell.net
wrote:
:
::Thanks Binyamin. Also a great example but it brings me to another
:question. What is the advantage of using PLO compare and fetch? Is it just
:saving CPU time in the case where the counter has changed? Is there
another :advantage that I'm not thinking about?
::
::Jon Perryman.
::
::
::
::
:: From: Binyamin Dissen bdis...@dissensoftware.com : : : :If
you :truly need a triple compare and swap then PLO will not help you. But
if ::you need a disjoint double compare and swap, you use the
compare-and-swap ::field as a counter and then you con do a compare swap
and double store.
::
::Example:
::
::     Fetch counter
::A   PLO  compare-and-fetch value1
::     CC0, go to A
::     PLO  compare-and-fetch value 2 ::     CC0, go to A ::   
calculate new value1 and 2 ::     Add one to fetched counter ::   
PLO CSDST fetched-counter new-fetched-counter, new value1,
:new-value2 :     CC0, go to A : : : :
::--
::For IBM-MAIN subscribe / signoff / archive access instructions, :send
:email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
Binyamin Dissen bdis...@dissensoftware.com http://www.dissensoftware.com

Director, Dissen Software, Bar  Grill - Israel


Should you use the mailblocks package and expect a response from me, you
should preauthorize the dissensoftware.com domain.

I very rarely bother responding to challenge/response systems, especially
those from irresponsible companies.

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Jon Perryman
Thanks for pointing out that it's required to do the PLO COMPARE against the 
counter and FETCH of the value otherwise there is no guarantee that value1 is 
consistent with the counter.

I'm also hearing you say that programs that reference more than a single word, 
must use PLO COMPARE and FETCH. In Kenneth's example where he uses PLO to save 
64 bit addresses (which is 2 words), he can't use LG to reference the 64 bit 
address otherwise he risks using high and low register values that do not 
match. Is that correct?

Jon Perryman.




 From: Binyamin Dissen bdis...@dissensoftware.com


That won't help if you fetch the new count and the old value1.

On Mon, 4 Nov 2013 11:38:38 -0600 Kenneth Wilkerson redb...@austin.rr.com
wrote:

:Yes, it is possible that the updates are not performed in any order.
:However, it is guaranteed that the updates are only performed if the swap
:can be done. Therefore, I use a simple rule. If the number of instructions
:needed to compute the new chain pointers are small (as is the case in my
:example). I don't incur the overhead of doing the extra 2 PLO (Compare and
:Load) operations. I simply re-drive the operation as shown in Binyamin's
:example. Even with the PLO Compare and Load, there is no guarantee the swap
:will succeed. It just lessens the likelihood. So the decision point is
:whether the overhead of 2 additional PLO instructions is less than the
:overhead of a re-drive. This can only be determined with testing. You can
:determine this by using a CS  to update a counter for every re-drive. You
:already have an operation count, so you can then easily determine the
:percentage of re-drives. In my experience, even in very active chains, the
:PLO serialization process will incur a very small number of re-drives (much
:less than 1 percent).  But only testing can reveal that. 
:
:-Original Message-
:From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
:Behalf Of Binyamin Dissen
:Sent: Monday, November 04, 2013 11:15 AM
:To: IBM-MAIN@LISTSERV.UA.EDU
:Subject: Re: Serialization without Enque
:
:My understanding is with multi-threading it is possible that the updates to
:the fields may be out of order and thus it is possible to fetch the updated
:counter with the unupdated value1. PLO serializes it.
:
:On Mon, 4 Nov 2013 07:46:51 -0800 Jon Perryman jperr...@pacbell.net wrote:
:
::Thanks Binyamin. Also a great example but it brings me to another
:question. What is the advantage of using PLO compare and fetch? Is it just
:saving CPU time in the case where the counter has changed? Is there another
:advantage that I'm not thinking about?
::
::Jon Perryman.
::
::
::
::
:: From: Binyamin Dissen bdis...@dissensoftware.com : : : :If you
:truly need a triple compare and swap then PLO will not help you. But if
::you need a disjoint double compare and swap, you use the compare-and-swap
::field as a counter and then you con do a compare swap and double store.
::
::Example:
::
::     Fetch counter
::A   PLO  compare-and-fetch value1
::     CC0, go to A
::     PLO  compare-and-fetch value 2
::     CC0, go to A
::     calculate new value1 and 2
::     Add one to fetched counter
::     PLO CSDST fetched-counter new-fetched-counter, new value1,
:new-value2 :     CC0, go to A : : : :
::--
::For IBM-MAIN subscribe / signoff / archive access instructions, :send
:email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
Binyamin Dissen bdis...@dissensoftware.com
http://www.dissensoftware.com

Director, Dissen Software, Bar  Grill - Israel


Should you use the mailblocks package and expect a response from me,
you should preauthorize the dissensoftware.com domain.

I very rarely bother responding to challenge/response systems,
especially those from irresponsible companies.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN




--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-04 Thread Kenneth Wilkerson
This is not correct. The choice to PLO compare and load is not required
since the count is always guaranteed to be swapped after the stores (my last
email). I only use PLO Compare and load for complex chain manipulations. But
do it if you want. The serialization performed by a PLO forces serialization
on the lock word for all processors. I try to Avoid it for situations where
a re-drive is less costly

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jon Perryman
Sent: Monday, November 04, 2013 2:42 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Thanks for pointing out that it's required to do the PLO COMPARE against the
counter and FETCH of the value otherwise there is no guarantee that value1
is consistent with the counter.

I'm also hearing you say that programs that reference more than a single
word, must use PLO COMPARE and FETCH. In Kenneth's example where he uses PLO
to save 64 bit addresses (which is 2 words), he can't use LG to reference
the 64 bit address otherwise he risks using high and low register values
that do not match. Is that correct?

Jon Perryman.




 From: Binyamin Dissen bdis...@dissensoftware.com


That won't help if you fetch the new count and the old value1.

On Mon, 4 Nov 2013 11:38:38 -0600 Kenneth Wilkerson 
redb...@austin.rr.com
wrote:

:Yes, it is possible that the updates are not performed in any order.
:However, it is guaranteed that the updates are only performed if the 
swap :can be done. Therefore, I use a simple rule. If the number of 
instructions :needed to compute the new chain pointers are small (as 
is the case in my :example). I don't incur the overhead of doing the 
extra 2 PLO (Compare and
:Load) operations. I simply re-drive the operation as shown in 
Binyamin's :example. Even with the PLO Compare and Load, there is no 
guarantee the swap :will succeed. It just lessens the likelihood. So 
the decision point is :whether the overhead of 2 additional PLO 
instructions is less than the :overhead of a re-drive. This can only 
be determined with testing. You can :determine this by using a CS  to 
update a counter for every re-drive. You :already have an operation 
count, so you can then easily determine the :percentage of re-drives. 
In my experience, even in very active chains, the :PLO serialization 
process will incur a very small number of re-drives (much :less than 1
percent).  But only testing can reveal that.
:
:-Original Message-
:From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] 
On :Behalf Of Binyamin Dissen
:Sent: Monday, November 04, 2013 11:15 AM
:To: IBM-MAIN@LISTSERV.UA.EDU
:Subject: Re: Serialization without Enque : :My understanding is 
with multi-threading it is possible that the updates to :the fields 
may be out of order and thus it is possible to fetch the updated 
:counter with the unupdated value1. PLO serializes it.
:
:On Mon, 4 Nov 2013 07:46:51 -0800 Jon Perryman jperr...@pacbell.net
wrote:
:
::Thanks Binyamin. Also a great example but it brings me to another 
:question. What is the advantage of using PLO compare and fetch? Is it 
just :saving CPU time in the case where the counter has changed? Is 
there another :advantage that I'm not thinking about?
::
::Jon Perryman.
::
::
::
::
:: From: Binyamin Dissen bdis...@dissensoftware.com : : : 
:If you :truly need a triple compare and swap then PLO will not help 
you. But if ::you need a disjoint double compare and swap, you use 
the compare-and-swap ::field as a counter and then you con do a compare
swap and double store.
::
::Example:
::
::     Fetch counter
::A   PLO  compare-and-fetch value1
::     CC0, go to A
::     PLO  compare-and-fetch value 2 ::     CC0, go to A ::     
calculate new value1 and 2 ::     Add one to fetched counter ::     
PLO CSDST fetched-counter new-fetched-counter, new value1,
:new-value2 :     CC0, go to A : : : :
::---
--- ::For IBM-MAIN subscribe / signoff / archive access instructions, 
:send :email to lists...@listserv.ua.edu with the message: INFO 
IBM-MAIN

--
Binyamin Dissen bdis...@dissensoftware.com 
http://www.dissensoftware.com

Director, Dissen Software, Bar  Grill - Israel


Should you use the mailblocks package and expect a response from me, 
you should preauthorize the dissensoftware.com domain.

I very rarely bother responding to challenge/response systems, 
especially those from irresponsible companies.

--
For IBM-MAIN subscribe / signoff / archive access instructions, send 
email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN




--
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM

Re: Serialization without Enque

2013-11-04 Thread Jon Perryman
As you say, PLO only locks CPU's using the same PLO lock word. For other CPU's 
not using the lockword, it is consider multiple unique instructions. So in the 
case of the 64 bit address, PLO CSDST, it is considered  compare, store value1, 
store value2, store swap value. Although it's unlikely, it is possible for the 
LG instruction to occur after store value1 but before store value2. Or are the 
stores considered a single occurrance instruction to the other CPU's?

Thanks, Jon Peryman. 




 From: Kenneth Wilkerson redb...@austin.rr.com
To: IBM-MAIN@LISTSERV.UA.EDU 
Sent: Monday, November 4, 2013 1:06 PM
Subject: Re: Serialization without Enque
 

This is not correct. The choice to PLO compare and load is not required
since the count is always guaranteed to be swapped after the stores (my last
email). I only use PLO Compare and load for complex chain manipulations. But
do it if you want. The serialization performed by a PLO forces serialization
on the lock word for all processors. I try to Avoid it for situations where
a re-drive is less costly

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jon Perryman
Sent: Monday, November 04, 2013 2:42 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Thanks for pointing out that it's required to do the PLO COMPARE against the
counter and FETCH of the value otherwise there is no guarantee that value1
is consistent with the counter.

I'm also hearing you say that programs that reference more than a single
word, must use PLO COMPARE and FETCH. In Kenneth's example where he uses PLO
to save 64 bit addresses (which is 2 words), he can't use LG to reference
the 64 bit address otherwise he risks using high and low register values
that do not match. Is that correct?

Jon Perryman.




 From: Binyamin Dissen bdis...@dissensoftware.com


That won't help if you fetch the new count and the old value1.

On Mon, 4 Nov 2013 11:38:38 -0600 Kenneth Wilkerson 
redb...@austin.rr.com
wrote:

:Yes, it is possible that the updates are not performed in any order.
:However, it is guaranteed that the updates are only performed if the 
swap :can be done. Therefore, I use a simple rule. If the number of 
instructions :needed to compute the new chain pointers are small (as 
is the case in my :example). I don't incur the overhead of doing the 
extra 2 PLO (Compare and
:Load) operations. I simply re-drive the operation as shown in 
Binyamin's :example. Even with the PLO Compare and Load, there is no 
guarantee the swap :will succeed. It just lessens the likelihood. So 
the decision point is :whether the overhead of 2 additional PLO 
instructions is less than the :overhead of a re-drive. This can only 
be determined with testing. You can :determine this by using a CS  to 
update a counter for every re-drive. You :already have an operation 
count, so you can then easily determine the :percentage of re-drives. 
In my experience, even in very active chains, the :PLO serialization 
process will incur a very small number of re-drives (much :less than 1
percent).  But only testing can reveal that.
:
:-Original Message-
:From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] 
On :Behalf Of Binyamin Dissen
:Sent: Monday, November 04, 2013 11:15 AM
:To: IBM-MAIN@LISTSERV.UA.EDU
:Subject: Re: Serialization without Enque : :My understanding is 
with multi-threading it is possible that the updates to :the fields 
may be out of order and thus it is possible to fetch the updated 
:counter with the unupdated value1. PLO serializes it.
:
:On Mon, 4 Nov 2013 07:46:51 -0800 Jon Perryman jperr...@pacbell.net
wrote:
:
::Thanks Binyamin. Also a great example but it brings me to another 
:question. What is the advantage of using PLO compare and fetch? Is it 
just :saving CPU time in the case where the counter has changed? Is 
there another :advantage that I'm not thinking about?
::
::Jon Perryman.
::
::
::
::
:: From: Binyamin Dissen bdis...@dissensoftware.com : : : 
:If you :truly need a triple compare and swap then PLO will not help 
you. But if ::you need a disjoint double compare and swap, you use 
the compare-and-swap ::field as a counter and then you con do a compare
swap and double store.
::
::Example:
::
::     Fetch counter
::A   PLO  compare-and-fetch value1
::     CC0, go to A
::     PLO  compare-and-fetch value 2 ::     CC0, go to A ::     
calculate new value1 and 2 ::     Add one to fetched counter ::     
PLO CSDST fetched-counter new-fetched-counter, new value1,
:new-value2 :     CC0, go to A : : : :
::---
--- ::For IBM-MAIN subscribe / signoff / archive access instructions, 
:send :email to lists...@listserv.ua.edu with the message: INFO 
IBM-MAIN

--
Binyamin Dissen bdis...@dissensoftware.com 
http://www.dissensoftware.com

Director

Re: Serialization without Enque

2013-11-04 Thread Kenneth Wilkerson
The order of stores is unpredictable except that  according to the POM,
operand 2 (in this case, the count) is always stored last. 

 In those cases when a store is performed to the second- operand location
and one or more of the fourth-, sixth-, and eighth-operand locations, the
store to the second-operand location is always performed last, as observed
by other CPUs and by channel programs. 

Page 7-290 right column top half of page in SA22-7832-09, 7-281 in
SA22-7832-08

So it's impossible for the count to be updated before the stores. I've been
using and relying on these techniques for years with exhaustive testing
under high workloads with re-drive statistics to help me decide the
algorithm that I use. It can't  hurt to do the PLO Compare and Load. It just
adds overhead that is probably more efficiently handled by a re-drive. But
it's up to you. I suggest you add redrive counter to your test case and see
for yourself. I would be extremely surprised if the re-drive percent were
ever higher than a small fraction of 1% no matter how hard you drove the
chain.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jon Perryman
Sent: Monday, November 04, 2013 3:31 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

As you say, PLO only locks CPU's using the same PLO lock word. For other
CPU's not using the lockword, it is consider multiple unique instructions.
So in the case of the 64 bit address, PLO CSDST, it is considered  compare,
store value1, store value2, store swap value. Although it's unlikely, it is
possible for the LG instruction to occur after store value1 but before store
value2. Or are the stores considered a single occurrance instruction to the
other CPU's?

Thanks, Jon Peryman. 




 From: Kenneth Wilkerson redb...@austin.rr.com
To: IBM-MAIN@LISTSERV.UA.EDU
Sent: Monday, November 4, 2013 1:06 PM
Subject: Re: Serialization without Enque
 

This is not correct. The choice to PLO compare and load is not required 
since the count is always guaranteed to be swapped after the stores (my 
last email). I only use PLO Compare and load for complex chain 
manipulations. But do it if you want. The serialization performed by a 
PLO forces serialization on the lock word for all processors. I try to 
Avoid it for situations where a re-drive is less costly

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] 
On Behalf Of Jon Perryman
Sent: Monday, November 04, 2013 2:42 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Thanks for pointing out that it's required to do the PLO COMPARE 
against the counter and FETCH of the value otherwise there is no 
guarantee that value1 is consistent with the counter.

I'm also hearing you say that programs that reference more than a 
single word, must use PLO COMPARE and FETCH. In Kenneth's example where 
he uses PLO to save 64 bit addresses (which is 2 words), he can't use 
LG to reference the 64 bit address otherwise he risks using high and 
low register values that do not match. Is that correct?

Jon Perryman.




 From: Binyamin Dissen bdis...@dissensoftware.com


That won't help if you fetch the new count and the old value1.

On Mon, 4 Nov 2013 11:38:38 -0600 Kenneth Wilkerson 
redb...@austin.rr.com
wrote:

:Yes, it is possible that the updates are not performed in any order.
:However, it is guaranteed that the updates are only performed if the 
swap :can be done. Therefore, I use a simple rule. If the number of 
instructions :needed to compute the new chain pointers are small (as 
is the case in my :example). I don't incur the overhead of doing the 
extra 2 PLO (Compare and
:Load) operations. I simply re-drive the operation as shown in 
Binyamin's :example. Even with the PLO Compare and Load, there is no 
guarantee the swap :will succeed. It just lessens the likelihood. So 
the decision point is :whether the overhead of 2 additional PLO 
instructions is less than the :overhead of a re-drive. This can only 
be determined with testing. You can :determine this by using a CS  to 
update a counter for every re-drive. You :already have an operation 
count, so you can then easily determine the :percentage of re-drives.
In my experience, even in very active chains, the :PLO serialization 
process will incur a very small number of re-drives (much :less than 
1
percent).  But only testing can reveal that.
:
:-Original Message-
:From: IBM Mainframe Discussion List 
[mailto:IBM-MAIN@LISTSERV.UA.EDU] On :Behalf Of Binyamin Dissen
:Sent: Monday, November 04, 2013 11:15 AM
:To: IBM-MAIN@LISTSERV.UA.EDU
:Subject: Re: Serialization without Enque : :My understanding is 
with multi-threading it is possible that the updates to :the fields 
may be out of order and thus it is possible to fetch the updated 
:counter with the unupdated value1. PLO serializes

Re: Serialization without Enque

2013-11-03 Thread Jon Perryman
I sure missed that one with the locks. 

PLO CDS does exactly what is wanted.  It does 2 CS's within the locked 
instruction. 

PLO CSDST on the other hand only does a single CS followed by 2 ST's. Since 3 
separate load instructions (not under PLO control) are required when not in 
contiguous storage, there is not any method that will guarantee the 3 values 
are consistent with the others. A counter as suggested by Peter Relson won't 
help either for this same reason.


I can't think of a situation where PLO CSDST is useful. Can anyone describe a 
situation where it is useful?

Jon Perryman.




 From: Rob Scott rsc...@rocketsoftware.com



I think the OP stated that his code could hold locks - in which case the latch 
services cannot be used.


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-02 Thread Jon Perryman
As a last resort, a GRS latch could be used.

Jon Perryman.




 From: Tony Harminc t...@harminc.net



On 1 November 2013 16:22, Donald Likens dlik...@infosecinc.com wrote:
 I have a situation where I need to serialize processing and cannot use CDS 
 because the two addresses being updated cannot
 be next to each other (because I use CDS with these two addresses with other 
 addresses).

CDSG, perhaps?


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-02 Thread Rob Scott
I think the OP stated that his code could hold locks - in which case the latch 
services cannot be used.

Rob Scott
Lead Developer
Rocket Software
77 Fourth Avenue . Suite 100 . Waltham . MA 02451-1468 . USA
Tel: +1.781.684.2305
Email: rsc...@rs.com
Web: www.rocketsoftware.com

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Jon Perryman
Sent: 02 November 2013 19:30
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

As a last resort, a GRS latch could be used.

Jon Perryman.




 From: Tony Harminc t...@harminc.net



On 1 November 2013 16:22, Donald Likens dlik...@infosecinc.com wrote:
 I have a situation where I need to serialize processing and cannot 
 use CDS because the two addresses being updated cannot be next to each other 
 (because I use CDS with these two addresses with other addresses).

CDSG, perhaps?


--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to 
lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-01 Thread Farley, Peter x23353
Have you considered whether any of the PLO functions would do the trick?  More 
expensive than CDS, but perhaps they would help?

HTH

Peter

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Donald Likens
Sent: Friday, November 01, 2013 4:23 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Serialization without Enque

My code must be able to run in SRB mode and with locks held. I have a situation 
where I need to serialize processing and cannot use CDS because the two 
addresses being updated cannot be next to each other (because I use CDS with 
these two addresses with other addresses). I have attempted to use a 
combination of CS instructions to resolve this problem but it does not work. I 
know this will work if I use a CMS lock but I am concerned about affecting the 
whole system. Any advice?

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

This message and any attachments are intended only for the use of the addressee 
and may contain information that is privileged and confidential. If the reader 
of the message is not the intended recipient or an authorized representative of 
the intended recipient, you are hereby notified that any dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please notify us immediately by e-mail and delete the message and any 
attachments from your system.


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-01 Thread Rob Scott
PLO with the CSDST function (compare swap and double store)

Sent from my iPhone

 On 1 Nov 2013, at 20:22, Donald Likens dlik...@infosecinc.com wrote:
 
 My code must be able to run in SRB mode and with locks held. I have a 
 situation where I need to serialize processing and cannot use CDS because the 
 two addresses being updated cannot be next to each other (because I use CDS 
 with these two addresses with other addresses). I have attempted to use a 
 combination of CS instructions to resolve this problem but it does not work. 
 I know this will work if I use a CMS lock but I am concerned about affecting 
 the whole system. Any advice?
 
 --
 For IBM-MAIN subscribe / signoff / archive access instructions,
 send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Serialization without Enque

2013-11-01 Thread Tony Harminc
On 1 November 2013 16:22, Donald Likens dlik...@infosecinc.com wrote:
 I have a situation where I need to serialize processing and cannot use CDS 
 because the two addresses being updated cannot
 be next to each other (because I use CDS with these two addresses with other 
 addresses).

CDSG, perhaps?

Tony H.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN