Re: Fw: Dataspace versus common area above the bar
I have never found comparing instruction speeds to be a fair gauge of performance. It's not the choice of instructions (unless the original choices were very poor) that affects performance but the algorithms. As has been pointed out, I have never seen any evidence that converting an algorithm using data spaces and ALETs to one using 64-bit instructions and shared memory objects would result in any measurable (2+% as an example) difference in performance. However, if the change afforded a way to significantly reduce the working set size or a way to search less frequently, this can often yield significant reductions in overhead.

Some things are very difficult to quantify. For example, there is significant argument over the advantages of transactional memory versus locks. On the surface, locking is more efficient but at a cost to throughput. Transactional memory can use more cycles but improve throughput. So how do you quantify this?

Almost 30 years ago, I developed a non-traditional storage manager that does not use chains. As a result, it does not experience storage fragmentation. Its path length varies only slightly from the first to the millionth call. As a result, it outperforms chained storage managers that require locks by many factors. And as the number of calls grows, the performance factor increases.

Again, I have never seen significant gains from using the same algorithms and simply changing the instructions. Whereas I have seen x-fold performance reductions by improving algorithms.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jim Mulder
Sent: Tuesday, January 21, 2014 7:25 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Fw: Dataspace versus common area above the bar

> AMODE does not affect performance. Can you explain which instructions
> you think are faster than some functional equivalent, and why you
> think they are faster?

> and it may be that what we have here is a misunderstanding of my
> language.
> Let me begin with a little history. On System/360 models above the
> model 30, L was faster than LH because they had [at least] four-byte
> fetch widths and had to 'throw away' half of what they fetched for LH.
>
> In my experience, and I have made many measurements, the same
> principle continues to apply mutatis mutandis today.
>
> I, for example, have a pair of assembly-language glb-seeking binary
> search routines that search the same table of quadword elements. One
> of these routines is AMODE(31) and one AMODE(64). The table (the same
> assembled table is always used) contains 63 elements. The usual 127
> searches are performed, each 256 times. In the upshot the AMODE(64)
> routine is measurably, 2.1201%, faster.
>
> I have performed similar tests using searches of ordered lists of
> 10 to 200 elements, in steps of 10. They are more addressing-intensive,
> and the superiority of the AMODE(64) routine increases almost linearly
> with table size, from 2.0897% for a list of 10 elements to 2.3311% for
> a list of 200 elements.
>
> Now it may be that what you mean by "AMODE does not affect
> performance" is different from what I mean. If so, I should be pleased
> to have you clarify the ways in which our uses of this word are
> different.

From a hardware design engineer:

All hardware instructions perform at the same speed in 64-bit mode or 31-bit mode. I assume the AMODE(31) and AMODE(64) he is referring to only affect the addressing mode, but the exact same instruction sequences are used in both cases. If different code sequences are being used, then all bets are off. My first statement applies to the exact same code sequence in 64-bit addressing mode versus 31-bit addressing mode. A few millicoded instructions do have slightly different path lengths depending on addressing mode, but even that is not common. If you can send me the listings of the exact code that you are measuring, I might be able to analyze the difference that you are measuring.
There certainly have been cases over the years where some processors required extra cycles to perform operand extension, especially when it involves sign-bit propagation. For specific instructions on a specific processor, I can ask the engineers if that is the case (as long as it is a recent enough processor that the engineers are still here).

Jim Mulder z/OS System Test IBM Corp. Poughkeepsie, NY

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
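The glb-seeking (greatest-lower-bound) binary search discussed above can be sketched in C. This is an illustrative reconstruction, not the poster's assembler routine: the function name is invented, 8-byte keys stand in for the quadword elements described, and the AMODE(31)/AMODE(64) distinction at the heart of the measurement exists only at the assembler level.

```c
#include <stddef.h>
#include <stdint.h>

/* Greatest-lower-bound binary search: returns the index of the largest
 * element <= key, or -1 if every element is greater than key.
 * Functionally analogous to the AMODE(31)/AMODE(64) assembler routines
 * discussed in the thread; the addressing mode itself cannot be shown
 * in portable C. */
ptrdiff_t glb_search(const uint64_t *table, size_t n, uint64_t key)
{
    ptrdiff_t lo = 0, hi = (ptrdiff_t)n - 1, glb = -1;

    while (lo <= hi) {
        ptrdiff_t mid = lo + (hi - lo) / 2;
        if (table[mid] <= key) {
            glb = mid;          /* candidate lower bound */
            lo = mid + 1;       /* look for a larger candidate */
        } else {
            hi = mid - 1;
        }
    }
    return glb;
}
```

The loop body is the same regardless of table size, which is why the measured difference in the thread tracks addressing intensity rather than instruction choice.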
Re: Dataspace versus common area above the bar
Because I've used memory objects for so long, I have not had a reason for IARVSERV. I read both the description in the macro reference and in the authorized assembler guide, and there seems to be a ton of restrictions and quirks (such as TPROT). The most notable restriction is the sharing limit of 16 pages for an unauthorized address space. However, this limit can be changed. But because of the ESQA considerations created because the page tables can map different virtual addresses for the shared pages, I'm not sure what would be a practical limit. It does appear to address guard and, to some extent, page protection. It also offers the ability to share 31-bit storage with 24-bit applications (a key point).

Shared and common memory objects do not have any of IARVSERV's restrictions and do not change my conclusion that performance is NOT the reason to convert to a memory object. It's the advanced functionality. One reason I use a common memory object is so I can avoid using CSA and SQA, particularly for code. With the 16-page restriction it would be impractical to share code with IARVSERV. And common data spaces cannot execute code.

There are no limits to the flexibility offered by memory objects. I can share any number of pages. With shared memory objects I can determine which address spaces have access and which do not. With common memory I can create my own CSA and even SQA, with some restrictions. As Jim affirmed, there is probably little if any performance difference between data spaces and memory objects. Choose the one best suited to your architecture.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jim Mulder
Sent: Monday, January 20, 2014 4:13 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Fw: Dataspace versus common area above the bar

>> Memory objects are much more flexible than data spaces. Data spaces
>> are limited to 2GB. Memory objects are only limited by the auxiliary storage.
>> Memory objects can be guarded and can also be page protected. Data spaces
>> cannot. Code can execute in a memory object but not in a data space. I
>> started using memory objects 10 years ago and have nearly forgotten how
>> to use a data space.

> Guard pages and protected pages can be created in data spaces using
> IARV64 with TARGET_VIEW=HIDDEN and TARGET_VIEW=READONLY

I meant IARVSERV, not IARV64.

Jim Mulder z/OS System Test IBM Corp. Poughkeepsie, NY
Re: Dataspace versus common area above the bar
Almost 10 years ago, I converted an application using 7 data spaces into one using a single shared memory object. As Gord has pointed out, the CPU advantage was negligible, though I feel it is very difficult to benchmark the effect. The real advantage was a reduction in error rates because of the management of the 7 ALETs and the inadvertent use of an ALET where it was not needed.

As far as isolation is concerned, memory objects are just as isolated and much more manageable than data spaces. Basing isolation of a common data space on the ALET value is no more isolation than basing the isolation of a common memory object on the AMODE.

Memory objects are much more flexible than data spaces. Data spaces are limited to 2GB. Memory objects are only limited by the auxiliary storage. Memory objects can be guarded and can also be page protected. Data spaces cannot. Code can execute in a memory object but not in a data space. I started using memory objects 10 years ago and have nearly forgotten how to use a data space.

So the question is not a question of CPU performance but a question of: do you have an application that is architected or can be architected to take advantage of the advanced features offered by memory objects? In my current application, I use local, shared and common memory objects. I place most (about 70%) of the code in one of these common memory objects and page protect them. I can't think of any instance where I would choose a data space over a memory object.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Gord Tomlin
Sent: Monday, January 20, 2014 10:23 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Dataspace versus common area above the bar

On 2014-01-20 04:38, John Blythe Reid wrote:
> I just wanted to sound people out about converting a dataspace to a
> common area above the bar. The main interest is the effect it would
> have on CPU usage.
> To put it into context, the dataspace is used for a set of tables
> which are used by the application programs. There are around eight
> thousand tables currently occupying about a gigabyte of virtual
> storage. This is a large installation with in excess of 700 million
> transactions per month plus a heavy batch load. The application
> programs make extensive use of these tables.
>
> Whenever an application program needs an element of one of the tables
> it calls a standard assembler module which uses access register mode
> to search the table in the dataspace and then returns the requested
> element to the application program.
>
> If the set of tables were placed above the bar then access register
> mode would not be needed as the tables would be directly addressable
> in 64 bit addressing mode.
>
> It all seems much simpler so, at first sight, it would be expected to
> use less CPU. A reduction in CPU would be the main justification for
> doing the conversion.
>
> I would be very interested on anyone's opinion on this subject.
>
> Regards,
> John.

I did some tests of a very similar scenario, expecting to see a significant performance gain. The actual results showed a reduction in CPU usage of about 1-2%. We decided that the gain was small enough that we were better off continuing to enjoy the data isolation provided by the data space.

--
Regards, Gord Tomlin
Action Software International (a division of Mazda Computer Corporation)
Tel: (905) 470-7113, Fax: (905) 470-6507
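Once the tables are above the bar, the application-side access John describes reduces to an ordinary direct lookup: no AR mode, no service-routine call to cross an address-space boundary. A hedged C sketch of that direct path (the entry layout, names, and key type are all invented for illustration; the real tables are searched by an assembler module):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a key plus a small payload the caller wants. */
typedef struct {
    uint32_t key;
    char     payload[12];
} entry_t;

typedef struct {
    size_t   count;
    entry_t *entries;   /* sorted by key; directly addressable above the bar */
} table_t;

/* Direct binary search of a table that is mapped into the caller's own
 * 64-bit address space -- the step that previously required an AR-mode
 * service routine against the data space. */
const entry_t *find_entry(const table_t *t, uint32_t key)
{
    size_t lo = 0, hi = t->count;      /* half-open interval [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (t->entries[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < t->count && t->entries[lo].key == key)
        return &t->entries[lo];
    return NULL;                       /* key not present */
}
```

As Gord's measurements suggest, the search itself dominates either way; removing the AR-mode plumbing simplifies the code far more than it reduces CPU.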
Re: APF authorization and JOBLIB DD card
> The short answer is that any module loaded by an authorized program
> must come from an authorized library.

I've been reading this post with interest since I've had to do a lot to deal with authorized services loading programs from unauthorized libraries. I have a utility that copies the joblib/steplib information and the load module information, including its APF authorization, from one address space and transmits the information via SRB to another, which can load a copy of an unauthorized program (via IRB) from an unauthorized library into another address space for special testing. It uses LOAD with ADRNAPF, which now also has an ADRNAPF64 parameter. Of course, this requires that the utility dynalloc the joblib/steplib in the IRB, open it, load, close it and unalloc it. It's a lot of code just to make a copy of a common program in another address space.

The point being that an authorized program can load from an unauthorized library provided it has the code to manage it. It doesn't need to modify the APF setting for a library. Of course, the unauthorized program is still set up to be called unauthorized. This is done for special debugging functions used to isolate a common piece of code from other callers in other address spaces.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Gerhard Postpischil
Sent: Thursday, December 19, 2013 12:57 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: APF authorization and JOBLIB DD card

On 12/18/2013 7:58 PM, Blaicher, Christopher Y. wrote:
> The short answer is that any module loaded by an authorized program
> must come from an authorized library. Loaded modules don't have to be
> authorized (AC=1), they just have to come from an authorized library.
> Now it gets more complicated.

I solved this problem a long time ago.
First on OS/360 by having a special step account code, and on later (test) systems by having a utility program that authorizes the tasklib, then loads the needed program(s). RACF can keep it out of unwanted hands. It saves time and effort testing programs that need authorization, and it also has a ZAP function for testing. It's heavily modified code from Don Higgins that I found on the CBT tape, but I don't remember what he called it; his version only has the ZAP capability. The added code is:

         SPACE 1
APFSET   ICM   R7,15,TCBJLB        TEST STEPLIB PRESENCE
         BZ    APFQUIT             NO STEPLIB
         USING IHADCB,R7           DECLARE IT
         L     R7,DCBDEBAD         LOAD DEB FOR STEPLIB
         N     R7,=X'00FFFFFF'     FIX HIGH BYTE
         USING DEBBASIC,R7
         OI    DEBFLGS1,DEBAPFIN   TURN ON APF LIBRARY BIT

Gerhard Postpischil
Bradford, Vermont
Re: Intercept USS calls
Modifying the CVT to perform intercepts is definitely very easy but also extremely risky. Modifying the CVT affects the entire system. All it takes is the mishandling of a single caller, particularly one critical to an address space, and all hell breaks loose. I tried it once. I modified the PC number in the SVT for a key system PC. A simple programming error caused system-wide havoc. I'll never do anything that has global system effects again. Any intercept must be designed to provide isolation, at least for testing.

On the other hand, PCs are managed at the address space level by z/Architecture. So provided you have the capabilities to create the necessary PC data structures required by the hardware in real, fixed storage, you can intercept PC calls. It takes a lot of code and is definitely not recommended for the faint of heart. Once a PC intercept is created, it's simple to pass the call to the original PC routine by simply branch entering the original code with the state set by the PC call. You already have the stacked entry. If you require both a front and back end intercept, this can easily be accomplished by creating "bypass" PC definitions that mimic the original PC definition. But from experience, unless you're willing to write and debug a lot of code, I'd get what I need from SMF.

Kenneth

From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Tony Harminc
Sent: Tuesday, December 17, 2013 2:25 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Intercept USS calls

On 17 December 2013 14:38, Don Poitras wrote:
> I don't see why someone couldn't install their own table in place of
> the one pointed to by the CVT. See
> http://pic.dhe.ibm.com/infocenter/zos/v2r1/index.jsp?topic=%2Fcom.ibm.
> zos.v2r1.bpxb100%2Fbpx2cr_Example.htm

Sure - I agree that that's not hard. But, as with SVC screening, you have to eventually pass control on to the real routine (or conceivably fail the call or implement a different version yourself).
If all you want to do is log the calls, it's probably not too hard, though you might have to be aware of the caller's environment. If you want to do all this without introducing security or integrity exposures, you may have to analyze each call you want to capture.

It may also be the case that some software "just knows" the PC numbers for certain routines, and doesn't go through the CSR table at all. Not a good practice, but I'd be surprised if it doesn't exist. And who knows what recovery and repair there may be in the UNIX kernel, or whether those tables are dynamically updated as a matter of routine.

This would be fun to experiment with on your own private LPAR or zPDT, and I'm not saying it can't or even shouldn't be done, but is anyone really going to install such a change into their production systems? That's why I said it falls into the "not for the faint of heart" category.

Tony H.
Re: Serialization without Enque
I'm with you on patents. I came across an IBM patent yesterday that was dated 2009 describing a lock-free storage manager using cell pools to manage variable length storage. I invented my first one 30 years ago using CAS. It was writing a more sophisticated version 10 years ago that led to my research on PLO. I'm now on my fourth version of this storage manager, this one I wrote for me, and it's much more sophisticated than the patented algorithm with numerous more capabilities. Like music, every piece of software is based on what we've seen before. So my real question with software patents is how do you prove it's original? It's the chicken and the egg. Which came first? Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of David Crayford Sent: Thursday, November 14, 2013 6:40 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque On 14/11/2013 12:23 AM, Kenneth Wilkerson wrote: > If I read the article you sent correctly, this algorithm is using a > spin lock. It has provision for implementing a lock-free algorithm but > none of those are detailed. Most of the shared_ptr implementations in the wild, including z/OS, use a lock-free algorithm invented by Alexander Terekhov (IBM). An atomic reference counted smart pointer is a nice tool to have in your bag. > PLO has no restrictions other than the limits of my imagination. Or patents! I notice IBM have quite a few wrt PLO. This one for example http://www.google.com.mx/patents/US6636883. The US patent office seems to like patents related to multi-processing. Maybe they think it's novel. Software patents have a funky smell! > > However, in the last sentence of the performance section, the authors > state "Note that we make no claim that the atomic access approach > unconditionally outperforms the others, merely that it may be best for > certain scenarios that can be encountered in practice.". 
> I think this is the key to lock-free implementations. I have
> generalized primitive functions to perform most of my PLO operations.
> But I design and tune each to the specific use. I understand the
> desire to provide generalized functionality for high level language
> use. However, I do not accept the premise that all lists are the same.
> And I would certainly use different algorithms for lists that I
> expected to get "large" and lists that have more than a small
> percentage of update operations.

While it may not be true on z/OS, most software developers these days use high level languages for multi-threaded applications and prefer abstractions because they are easy to use and reason about. Of course, that doesn't mean that they shouldn't understand what's happening under the covers. The trick is keeping the optimizer out of the mix, which is where inline assembler comes in handy.

There are many high-quality (free) libraries for lock-free data structures that are relatively easy to port to z/OS. Using advanced language features it's quite simple to configure a highly granular concurrent queue by using policies. The difficult part is testing the bloody things!

    typedef mpmc_queue< fixed_array, lock_free, smart_ptr > multi_queue;
    typedef spsc_queue< priority_list, mutex, unique_ptr > blocking_queue;
    typedef thread_pool< multi_queue, max_threads<8> > task_pool;
    socket_server server;

> Kenneth
>
> -Original Message-
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU]
> On Behalf Of David Crayford
> Sent: Tuesday, November 12, 2013 11:45 PM
> To: IBM-MAIN@LISTSERV.UA.EDU
> Subject: Re: Serialization without Enque
>
> On 13/11/2013 12:34 PM, Kenneth Wilkerson wrote:
>> Actually, the algorithm performs well for read-often, write-rarely
>> list because the active chain count does not change and therefore
>> there are relatively infrequent re-drives. The active chain count
>> only changes on an add or delete.
>> So if there are infrequent adds and deletes, there will be infrequent
>> re-drives. And you are wrong, readers will not contend unless two or
>> more tasks are referencing the exact same element simultaneously. And
>> even then, the re-drive only involves the update to the use count.
>
> Thanks, I get it now. Maybe IBM should have used PLO for the z/OS C++
> shared_ptr SMR algorithm which has both weak/strong reference counts +
> the pointer. They use a lock-free algorithm using CS
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2674.htm.
> shared_ptr implements proxy based garbage collection.
>
>> There are a lot of optimizations that I do not describe in this
>> algorithm for simplicity.
Re: Serialization without Enque
If I read the article you sent correctly, this algorithm is using a spin lock. It has provision for implementing a lock-free algorithm but none of those are detailed. There are certainly cases where spin locks work very effectively, particularly if the thread finds a lock is unavailable and voluntarily relinquishes control. I try to avoid spin locks because if they are performed in a high priority application they can cause detrimental system effects. The applications I write are system level and can be executed from anywhere, meaning that applications that use them may be in a locked state, in an SRB, at a high priority, etc., so it behooves me to only use hardware-provided methods for serialization. This is the primary reason I use PLO. PLO has no restrictions other than the limits of my imagination.

However, in the last sentence of the performance section, the authors state "Note that we make no claim that the atomic access approach unconditionally outperforms the others, merely that it may be best for certain scenarios that can be encountered in practice.". I think this is the key to lock-free implementations. I have generalized primitive functions to perform most of my PLO operations. But I design and tune each to the specific use. I understand the desire to provide generalized functionality for high level language use. However, I do not accept the premise that all lists are the same. And I would certainly use different algorithms for lists that I expected to get "large" and lists that have more than a small percentage of update operations.
Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of David Crayford
Sent: Tuesday, November 12, 2013 11:45 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

On 13/11/2013 12:34 PM, Kenneth Wilkerson wrote:
> Actually, the algorithm performs well for read-often, write-rarely
> list because the active chain count does not change and therefore
> there are relatively infrequent re-drives. The active chain count
> only changes on an add or delete. So if there are infrequent adds and
> deletes, there will be infrequent re-drives. And you are wrong,
> readers will not contend unless two or more tasks are referencing the
> exact same element simultaneously. And even then, the re-drive only
> involves the update to the use count.

Thanks, I get it now. Maybe IBM should have used PLO for the z/OS C++ shared_ptr SMR algorithm which has both weak/strong reference counts + the pointer. They use a lock-free algorithm using CS http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2674.htm. shared_ptr implements proxy based garbage collection.

> There are a lot of optimizations that I do not describe in this
> algorithm for simplicity. For example, when you do a PLO DCAS, the
> condition code is set to indicate which compare failed. This can be
> used to optimize the re-drive. There are many other optimizations
> with re-drives to make this more efficient. And if you get away from
> traditional lists, there are even more optimizations.

Like what? What do you replace linked lists with, arrays with offsets instead of pointers?

> Honestly, I provided this algorithm after reading the paper on hazard
> pointers. The paper was written in 2002 and claimed there was no
> atomic DCAS when PLO DCAS became available in 2001.
So I took a much > simpler algorithm that I had and modified it to use a use count to > accommodate traditional storage managers to prove that a PLO could be > used to manage a conventional list using a traditional storage manager > and provide a SMR algorithm without the need for DU level management > structures. I don't use many conventional lists and I have a proprietary storage manager that does not use chains. > Most of my PLO operations are much simpler. > I would love to test this algorithm against any other SMR algorithm. > My point has been and remains, that PLO can be efficiently used to > serialize any list in a lock-free manner and even if it does take more > CPU this will be offset by increased throughput. > > And just because UNIX has issues with PLO doesn't mean the issue is > with PLO... UNIX doesn't have an issue with PLO. It clearly states that popping/pushing elements at the beginning of the queue is a good performer. Surely your algorithm would have the same problem if multiple producers/consumers were inserting/removing elements from the middle of a long list. > Kenneth > > -Original Message- > From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] > On Behalf Of David Crayford > Sent: Tuesday, November 12, 2013 8:39 PM > To: IBM-MAIN@LISTSERV.UA.EDU > Subject: Re:
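The "attempt, then re-drive on interference" pattern that runs through this exchange can be shown with ordinary single-word compare-and-swap, which C11 exposes portably. This is an analogy only, not Kenneth's PLO algorithm (PLO has no C-level equivalent), and the names are invented; a classic lock-free LIFO push:

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct node {
    struct node *next;
    int          value;
} node_t;

/* Lock-free LIFO push.  The compare-exchange loop is the same shape as
 * the PLO re-drive discussed above: build the update optimistically,
 * attempt it atomically, and retry if another task got there first. */
void push(_Atomic(node_t *) *head, node_t *n)
{
    node_t *old = atomic_load(head);
    do {
        n->next = old;   /* link to the head we observed */
    } while (!atomic_compare_exchange_weak(head, &old, n));
    /* on failure, 'old' is refreshed with the current head and we retry */
}
```

As the UNIX message-queue documentation cited in the thread notes, this works well at the ends of a queue; serializing updates in the middle of a long list is where richer primitives like PLO (or a lock) earn their keep.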
Re: Serialization without Enque
The active count is incremented for every add and delete. It is never decremented, so any update would result in a change. In my actual algorithms, all my lists are in shared or common memory objects, so all the pointers are 64-bit and I use the +2 variations on the PLO. In this case, I use a counter with 2 fullwords on a doubleword boundary. The first fullword is the change count, and it is always incremented. The second fullword is the element count, and it is incremented for each add and decremented for each delete. I load the counter with an LG and then use ALG and AL or SL to manipulate the high or low word.

Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Shmuel Metz (Seymour J.)
Sent: Wednesday, November 13, 2013 7:29 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

In <001801cee029$af0d2bc0$0d278340$@austin.rr.com>, on 11/12/2013 at 10:34 PM, Kenneth Wilkerson said:

>Actually, the algorithm performs well for read-often, write-rarely list
>because the active chain count does not change and therefore there are
>relatively infrequent re-drives.

What happens if there is an intervening add and also an intervening remove, leaving no net change in the active chain count even though the chain itself has changed?

--
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see <http://patriot.net/~shmuel/resume/brief.html>
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)
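The two-fullword counter Kenneth describes (change count in the high word, element count in the low word, on a doubleword boundary) can be sketched with a single 64-bit atomic in C. This is a hedged approximation: the original manipulates the words with LG/ALG/AL/SL under PLO serialization, whereas here one atomic add produces the same net effect for these two operations, and the names are invented.

```c
#include <stdatomic.h>
#include <stdint.h>

/* High 32 bits: change count (bumped on every add AND delete, so any
 * update is observable even when adds and deletes cancel out -- the
 * case Shmuel raises).  Low 32 bits: current element count. */

static void count_add(_Atomic uint64_t *ctr)
{
    /* one atomic add: +1 change count, +1 element count */
    atomic_fetch_add(ctr, (UINT64_C(1) << 32) + 1);
}

static void count_del(_Atomic uint64_t *ctr)
{
    /* adding 0xFFFFFFFF to the low word subtracts 1 from it and, via
     * the carry, adds 1 to the high word.  Valid only while the
     * element count is >= 1, which a delete presupposes. */
    atomic_fetch_add(ctr, (UINT64_C(1) << 32) - 1);
}
```

Because the change count moves on every mutation, an add followed by a delete leaves the element count unchanged but still alters the counter, so a PLO-style compare against the saved counter correctly fails and re-drives.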
Re: Serialization without Enque
Actually, the algorithm performs well for a read-often, write-rarely list because the active chain count does not change and therefore there are relatively infrequent re-drives. The active chain count only changes on an add or delete. So if there are infrequent adds and deletes, there will be infrequent re-drives. And you are wrong, readers will not contend unless two or more tasks are referencing the exact same element simultaneously. And even then, the re-drive only involves the update to the use count.

There are a lot of optimizations that I do not describe in this algorithm for simplicity. For example, when you do a PLO DCAS, the condition code is set to indicate which compare failed. This can be used to optimize the re-drive. There are many other optimizations with re-drives to make this more efficient. And if you get away from traditional lists, there are even more optimizations.

Honestly, I provided this algorithm after reading the paper on hazard pointers. The paper was written in 2002 and claimed there was no atomic DCAS, when PLO DCAS became available in 2001. So I took a much simpler algorithm that I had and modified it to use a use count to accommodate traditional storage managers, to prove that a PLO could be used to manage a conventional list using a traditional storage manager and provide an SMR algorithm without the need for DU-level management structures. I don't use many conventional lists, and I have a proprietary storage manager that does not use chains. Most of my PLO operations are much simpler.

I would love to test this algorithm against any other SMR algorithm. My point has been and remains that PLO can be efficiently used to serialize any list in a lock-free manner, and even if it does take more CPU this will be offset by increased throughput.

And just because UNIX has issues with PLO doesn't mean the issue is with PLO...
Kenneth

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of David Crayford
Sent: Tuesday, November 12, 2013 8:39 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Serialization without Enque

Thanks for sharing your design Ken. It seems to me that PLO is best used for data structures like double-ended queues where elements can be inserted/removed from both ends of the queue atomically. In the case of a read-often-write-rarely list with multiple readers that traverse the list it doesn't seem optimal. Correct me if I'm wrong but readers will contend with each other.

FYI, z/OS UNIX message queues can be configured to use PLO (with fallback to a latch). The documentation states that inserting/removing from the middle of a long list using PLO is a poor performer http://pic.dhe.ibm.com/infocenter/zos/v1r12/topic/com.ibm.zos.r12.bpxb100/qct.htm#qct. There isn't a one-size-fits-all solution. It depends on the usage patterns. Hopefully HTM will solve that. Depending on how Intel's Haswell TSX implementation performs in the wild, we could see HTM in our cell phones as early as next year.

On 12/11/2013 11:18 PM, Kenneth Wilkerson wrote:
> I use cell pools. I also use a proprietary storage manager that
> doesn't use chains. These methodologies offer me capabilities well
> beyond those found in traditional methods. Much of what I do is based
> on these capabilities, but the algorithms could easily be adapted to
> use a conventional storage manager that uses chains.
>
> Here is an algorithm I use that I've adapted for traditional storage
> management. This algorithm will work for any list: LIFO, FIFO,
> ordered, or other, and for deletion from the head, middle or tail.
>
> Setup: Allocate a word in each element. Bit 0 is one for all active
> elements. Bit 1 is one for all elements pending deletion. Bit 2 is
> reserved for an extension to this algorithm (such as garbage
> cleanup).
The remaining bits are a use count allowing for many more DUs than are supported by MVS. > > Note: When PLO Compare and Swap (CS) or Double Compare and Swap > (DCAS) is referenced, the PLO uses the use count address as the lock > word. This will serialize all updates to the use counter for that > element. For the PLO Compare and Loads or PLO update, the lock word is the active chain counter. > > To search: > Step 1: Use the PLO Compare and Load on the active chain counter to > search the chain as before. If the PLO fails, re-drive the search. > > Step 2: Before examining the element, increment the use count with a > PLO Double Compare and Swap (DCAS). Load the first register pair with > the current chain counter. The swap value will also be the current > chain counter. Essentially, we're using the active chain count to > serialize increments to the use count to avoid accessing an area that > may have been released. The second register pair will contain the > current use count with
Re: Serialization without Enque
I use cell pools. I also use a proprietary storage manager that doesn't use chains. These methodologies offer me capabilities well beyond those found in traditional methods. Much of what I do is based on these capabilities, but the algorithms could easily be adapted to use a conventional storage manager that uses chains. Here is an algorithm I use that I've adapted for traditional storage management. This algorithm will work for any list: LIFO, FIFO, ordered or other, and for deletion from the head, middle or tail. Setup: Allocate a word in each element. Bit 0 is one for all active elements. Bit 1 is one for all elements pending deletion. Bit 2 is reserved for an extension to this algorithm (such as garbage cleanup). The remaining bits are a use count allowing for many more DUs than are supported by MVS. Note: When PLO Compare and Swap (CS) or Double Compare and Swap (DCAS) is referenced, the PLO uses the use count address as the lock word. This will serialize all updates to the use counter for that element. For the PLO Compare and Loads or PLO update, the lock word is the active chain counter. To search: Step 1: Use the PLO Compare and Load on the active chain counter to search the chain as before. If the PLO fails, re-drive the search. Step 2: Before examining the element, increment the use count with a PLO Double Compare and Swap (DCAS). Load the first register pair with the current chain counter. The swap value will also be the current chain counter. Essentially, we're using the active chain count to serialize increments to the use count to avoid accessing an area that may have been released. The second register pair will contain the current use count with a swap value incremented by 1 using an AL to avoid resetting the high bit. If the PLO DCAS fails, the previous PLO Compare and Load (Step 1) should be re-driven. Step 3: Use the PLO Compare and Load for the next element. Save the PLO status and decrement the use count with a SL using PLO CS. 
We don't need a DCAS because the use count is not 0 and this element can't be deleted. If this PLO CS fails, re-drive it. If the PLO Compare and Load status is re-drive, then before re-driving the search, check the use count (in the register used to update it). If Bit 1 (pending delete) is set and the use count is 0, this task can release it. To delete: (this assumes the deleting task has already updated the use count in the process of finding the element to delete) Step 1: Use a PLO update function to remove the element from the active chain to avoid any future references. Step 2: If the PLO update to remove the element fails, decrement the use count but do NOT set bit 1 (pending delete) using a PLO CS. If the PLO CS fails, re-drive it. Step 3: If the PLO update to remove the element succeeds, decrement the use count, SET bit 1 (pending delete) and RESET Bit 0 (active bit) using a PLO CS. If the PLO CS fails, re-drive it. Step 4: Whether the PLO update succeeded or failed, check the use count in the register used to update it: If bit 1 (pending delete) is set and the use count is not 0, then this task should exit. If bit 1 (pending delete) is set and the use count is 0, then this task can release it. Otherwise, this task must re-drive the search to find the element to be deleted. You can work out the various scenarios yourself. But because the count is incremented/decremented after a PLO Compare and Load or update, the status of the PLO provides a decision point on whether an element may have been deleted. Using the use count address as the lock word ensures that all use count updates for a specific use count occur serially. There are numerous extensions to this algorithm that are more than I want to describe. Things like adding pending deletes to a delete chain or having an asynchronous, periodic garbage collection task handle the deletes. 
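A compressed, single-threaded C sketch of the flag-word bookkeeping described above (active bit, pending-delete bit, use count). All names are invented, and PLO's lock-word serialization plus the chain-counter compares are deliberately left out, so this only illustrates who frees the storage and when:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define F_ACTIVE   0x80000000u  /* bit 0: element is on the active chain   */
#define F_PENDDEL  0x40000000u  /* bit 1: delete done, readers outstanding */
#define COUNT_MASK 0x3FFFFFFFu  /* remaining bits: the use count           */

struct elem {
    struct elem *next;
    uint32_t     word;      /* flags + use count, one word per element */
    int          key;
};

static struct elem *head;
static uint32_t active_changes;   /* stands in for the active chain counter */
static int freed;                 /* elements actually released */

/* Search step 2: caller saw a consistent chain counter, so bump the count. */
static void use_acquire(struct elem *e) { e->word += 1; }

/* Drop a use count; the last holder of a pending-delete element frees it. */
static void use_release(struct elem *e)
{
    e->word -= 1;
    if ((e->word & F_PENDDEL) && (e->word & COUNT_MASK) == 0) {
        free(e);
        freed++;
    }
}

/* Delete steps 1 and 3: unchain, flip active to pending-delete, then drop
   the deleter's own use count (acquired while searching for the element). */
static void elem_delete(struct elem *e)
{
    struct elem **pp = &head;
    while (*pp != e) pp = &(*pp)->next;
    *pp = e->next;
    active_changes++;             /* in-flight searches will re-drive */
    e->word = (e->word & ~F_ACTIVE) | F_PENDDEL;
    use_release(e);
}
```

The property being illustrated: an element leaves the chain immediately, but its storage is only released once the last outstanding use count is dropped, which is the safe-memory-reclamation guarantee the post claims.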
Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jon Perryman Sent: Monday, November 11, 2013 9:38 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque Could you provide insight into a design that would handle the situation where an SSI program with a non-trivial workload uses a very large list? This list is normally semi-static but there can be periods of time where entries are heavily deleted and added. Not freeing storage is not an option because it could be a significant amount of storage. Thanks, Jon Perryman. > > From: Tony Harminc >To: IBM-MAIN@LISTSERV.UA.EDU >Sent: Monday, November 11, 2013 7:07 PM >Subject: Re: Serialization without Enque > > >On 11 November 2013 20:15, Jon Perryman wrote: >> L R2,QUEUE >> L R3,NEXT_ENTRY >> CS R2,R3,QUEUE      New queue head >>While this seems bullet proof, it's not. If there is a long delay between
Re: Serialization without Enque
>In PLO, the hardware locking occurs according to the lock word. The POM is not specific about the lock word other than that a transformation occurs to generate a PLT logical address used to acquire a lock. However, this does not affect its application. The key point is that two or more processors executing PLO simultaneously can access or alter the values using a PLO instruction and the operation will occur as if one PLO followed the other. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Peter Relson Sent: Monday, November 11, 2013 7:09 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque >suggests that Hardware Transaction Memory may not be the panacea we all >expect it to be, and in some cases may actually increase CPU Of course it's true that if a transaction experiences too much contention and resorts to its fallback path, you have used more CPU than if you had gone directly to the fallback path. That is specifically why every use of non-constrained transactions ought to do analysis to determine if it is even theoretically beneficial. What I don't see mentioned in the article is zEC12's constrained transactions. By their very definition they need no fallback path. That is a huge benefit both in terms of complexity and development/test cost. >In PLO, the hardware locking occurs according to the lock word. You seem to be assuming that the PLO implementation actually is truly locked according to the individual lock word. Maybe it is now. It definitely did not used to be. The machine would decide how to map the individually-specified lock word to (limited) hardware resources that were the true serialization mechanism. It was not necessarily one to one. >Transaction Memory sounds exciting but it's complex. IBM should put a >layer of abstraction on top with simple semantics. 
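Peter's point (the machine maps each lock word to a limited set of real serialization resources, not necessarily one to one) can be pictured with a toy transform. The hash below is invented; the POM leaves the actual PLT transformation model-dependent:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of mapping a lock word's address to one of a small, fixed
   number of hardware lock slots.  The transform is hypothetical. */
#define PLT_SLOTS 16u

static unsigned plt_slot(const void *lockword)
{
    uintptr_t a = (uintptr_t)lockword;
    return (unsigned)((a >> 3) % PLT_SLOTS);   /* invented hash */
}
```

Two unrelated lock words can land on the same slot, which is why lock-word choice can affect contention even between logically independent chains.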
Maybe it's me, but I don't really find TBEGIN...TEND complex compared to other serializing techniques, even when you factor in PPI while counting the number of attempts before taking the fallback path. The instructions within a transaction are typically less complex than the instructions you would need without a transaction, if you could even accomplish what you're trying to do outside of a transaction. For example, there is no need for CS, PLO. Just more straightforward "compare", "store", etc. Peter Relson z/OS Core Technology Design -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: Serialization without Enque
I read the article. It's still based on a CAS, which I don't necessarily consider simpler than PLO. In this article, it states "Each hazard pointer can be written only by its associated thread, but can be read by all threads." This is exactly the provision I state in my example. It's a key element of any serialization algorithm that allows concurrent lock-free operations. Hazard pointers seem like an elegant general solution. I'll have to explore it, but I've not had issues with any of the solutions I've written using other methods. I write a solution appropriate to the usage of the serialized resource. The key is a rigid protocol that has to always be followed. In my e-mail last night I meant transactional execution facility, not transactional event facility. I've thoroughly read the POM on this facility but have not to date had access to a processor to experiment with it. From my reading, the transactional execution facility (TM) appears to be a way to implement almost all requirements for concurrent read/write access to a serialized resource. Essentially, from my understanding of the POM, it's a CAS on as many operands as necessary within a hardware-defined limit. In my very complex application, there are a couple of scenarios where I cannot use PLO to serialize access. I have no problem using multi-step PLO operations for serialization as long as the integrity of the resource is guaranteed after each step. For example, a delete that consists of a PLO to remove from the active chain and a separate PLO to add the removed element to a free chain. However, some operations are too complex for a PLO CAS and triple store even if the operands are organized in storage such that you can modify 128 bits at a time. In these cases, I use a gate word and a spin lock. When available, the gate is 0. When in use, the gate contains identification information for the gate owner. I very rarely have to use these. 
And if I had a TM-like transactional execution facility, I would replace this spin lock with that facility. Normally, there are only a handful of instructions within the gate, so this has never caused me any problems. In a sense, all methods (LL/CS, TM, etc.) are spin locks. If they don't succeed, try, try again. In all methods that I know of, the hardware must perform a memory serialization function. I use PLO instead of CS not only because of the increased functionality, such as modifying noncontiguous areas and being able to modify up to four 128-bit areas, but also because I believe the PLO lock word is an advantage. In all hardware serialization methods that I know of, a memory serialization function is required during the LL/CS or TM. These serializing functions can be expensive. CS is not granular and the serialization proceeds without regard to accesses by other CPUs, meaning the overhead occurs whether the function succeeds or fails. The PLO lock word "gates" access by all processors using the same lock word, thus reducing the total number of serializations performed by "stopping" a processor performing a PLO using the same lock word. This requires careful selection of lock words and use of the same lock word by all processes that read/write to the same resource. This advantage can be negated in a queue that has a substantial percentage of write operations compared to read operations, because write operations will necessarily result in a PLO failure. I believe this is the disadvantage referred to in your first paper on TM by Paul McKenney. I suspect that IBM is using TM to replace a lock and that in most cases the lock was used to serialize storage alterations. In this case, CPU would increase but so would throughput. I believe this is a classic example of trading the "less expensive" CPU resource for the "more expensive" throughput. 
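The gate word described above (0 when available, owner identification when held) is essentially a tiny spin lock. A minimal C11 sketch with invented names; a real implementation would also spin with backoff and record richer owner identification:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t gate;   /* 0 = available, else owner id */

static void gate_enter(uint32_t owner_id)
{
    uint32_t expect = 0;
    /* spin until we atomically swap our id into a free gate */
    while (!atomic_compare_exchange_weak(&gate, &expect, owner_id))
        expect = 0;             /* CAS updated expect; reset and retry */
}

static void gate_exit(void)
{
    atomic_store(&gate, 0);     /* only the owner should open the gate */
}
```

Storing an owner id instead of a bare 1 costs nothing and makes a hung gate diagnosable, which matches the rationale given in the post.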
Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of David Crayford Sent: Sunday, November 10, 2013 8:56 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque On 11/11/2013 10:36 AM, Kenneth Wilkerson wrote: > I read the article. This article is about transactional event facility > introduced in z/EC-12 and not PLO which is an LL/CS. I wish I had access to a > z/EC-12 with the transactional event facility to play with it and compare it > to PLO. The transactional event facility is much more comprehensive and not > as granular as a PLO. In PLO, the hardware locking occurs according to the > lock word. Transaction Memory sounds exciting but it's complex. IBM should put a layer of abstraction on top with simple semantics. > I've done a lot of testing with PLO. It can increase CPU, particularly in a > situation where updates are much higher percentage of the operations. But in > all applications that I'
Re: Serialization without Enque
I read the article. This article is about the transactional event facility introduced in the zEC12 and not PLO, which is an LL/CS. I wish I had access to a zEC12 with the transactional event facility to play with it and compare it to PLO. The transactional event facility is much more comprehensive and not as granular as a PLO. In PLO, the hardware locking occurs according to the lock word. I've done a lot of testing with PLO. It can increase CPU, particularly in a situation where updates are a much higher percentage of the operations. But in all applications that I've tested, its CPU overhead is offset by higher throughput. In a traditional locking method, tasks end up serializing on the lock. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of David Crayford Sent: Sunday, November 10, 2013 6:50 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque On 11/11/2013 5:19 AM, Mark Zelden wrote: > On Sat, 9 Nov 2013 19:47:35 GMT, esst...@juno.com wrote: > >> I have been reading and following this thread since PLO is not an instruction >> I use every day. >> It would be nice if someone would actually post some working code using a >> PLO instruction, to illustrate how one would add an element to a queue and >> remove an element from a queue. >> >> Paul D'Angelo > I've not been paying that close of attention, but I'm more curious > about what people did for these situations prior to PLO. They used smart algorithms using the atomic instructions they had, like RCU http://en.wikipedia.org/wiki/Read-copy-update. It's interesting that I have never seen any use of the PLO instruction in the zLinux kernel code. Paul McKenney, IBM's expert on these things, wrote a good article that suggests that Hardware Transaction Memory may not be the panacea we all expect it to be, and in some cases may actually increase CPU http://paulmck.livejournal.com/31285.html. 
> Mark > -- > Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS > mailto:m...@mzelden.com ITIL v3 Foundation Certified Mark's MVS > Utilities: http://www.mzelden.com/mvsutil.html > Systems Programming expert at > http://search390.techtarget.com/ateExperts/
Re: Serialization without Enque
For anyone interested. THIS IS AN EXAMPLE OF THE COMPLETE PROCESS USED TO ADD A NEW DISPATCHABLE UNIT (DU) ENTRY TO A QUEUE OF ACTIVE DUS. FOR SIMPLICITY, A DU IS EQUIVALENT TO A TASK. THIS QUEUE CAN BE SIMULTANEOUSLY SEARCHED, NEW DUS ADDED AND TERMINATING DUS DELETED. ADDS AND DELETES ARE DONE VERY INFREQUENTLY (COMPARED TO SEARCHES) AND ONLY BY THE DU ITSELF. DU ENTRIES FOR OTHER DUS ARE NEVER ADDED OR DELETED FROM THE QUEUE BY ANOTHER DU. THE ADD AND DELETE RULE IS VERY IMPORTANT BECAUSE ONLY THE DU CAN OWN ITS ENTRY, THUS ENSURING THAT THE OWNER (CODE THAT REFERENCES THE ENTRY) IS THE ONLY CODE THAT CAN ADD OR DELETE THAT ENTRY. REGARDLESS OF WHAT TYPE OF QUEUE YOU ARE SERIALIZING WITH PLO, COMMON SENSE MUST APPLY. IF DELETES WERE ALLOWED FOR AN ENTRY THAT COULD STILL BE REFERENCED, THERE WILL BE UNDESIRABLE RESULTS. IN THOSE CASES, SUCH AS A WORK QUEUE, THE ENTRY SHOULD BE DELETED FROM THE ACTIVE CHAIN OR A MECHANISM PROVIDED TO DEFINE OWNERSHIP, THUS PREVENTING DELETES OF "OWNED" ENTRIES. THE QUEUE POINTERS ARE KEPT IN COMMON STORAGE BUT NOT IN CSA. THIS CODE USES EITHER SHARED OR COMMON MEMORY OBJECTS. WHEN THIS CODE EXECUTES, IT IS GUARANTEED THAT THE MEMORY OBJECT IS AVAILABLE. THE FOLLOWING STRUCTURE IS ALWAYS USED FOR ALL QUEUES LIKE THIS ONE:

QUEUE            ...
QUEUE_START      DS D    START OF CELLS USED FOR QUEUE
* WHEN THE QUEUE IS INITIALIZED, THE FREE CHAIN IS 0 TO AVOID
* ADDING FREE ELEMENTS TO THE WORKING SET OF THIS CODE. INSTEAD, THE
* HWM=START AND AVAILQ IS 0. THE CODE SHOWS HOW THIS IS MANAGED
*
QUEUE_HWM        DS D    CURRENT HIGH WATER MARK OF CELLS
QUEUE_END        DS D    END OF CELLS USED FOR THIS QUEUE
*
QUEUE_HEAD       DS D    CURRENT ACTIVE HEAD
QUEUE_TAIL       DS D    CURRENT ACTIVE TAIL
QUEUE_COUNTERS   DS 0D   LOCK WORD AND COUNTERS
QUEUE_CHANGES    DS A    HIGH WORD IS CHANGES
QUEUE_ENTRIES    DS A    LOW WORD IS ENTRIES IN CHAIN
*
QUEUE_AVAILQ     DS D    FREE CHAIN IS SINGLY LINKED
QUEUE_AVCOUNTERS DS 0D   LOCK WORD AND COUNTERS FOR FREE
QUEUE_AVCHANGES  DS A    HIGH WORD IS CHANGES
QUEUE_AVENTRIES  DS A    LOW WORD IS ENTRIES IN CHAIN
                 ...

THE COUNTER IS A DOUBLEWORD CONSISTING OF 2 FULLWORD COUNTERS. THE COUNTER IS ALWAYS USED AS THE PLO LOCK WORD AND IS ALWAYS LOADED WITH AN LG (LOAD GRANDE). THE CHANGE COUNT IS ALWAYS THE HIGH WORD AND IS INCREMENTED WITH THE INCCHANGES CONSTANT IN THE PROGRAM CONSTANTS. THE ENTRY COUNT IS ALWAYS THE LOW WORD AND IS INCREMENTED WITH AHI, WHICH ONLY AFFECTS THE LOW WORD. I DON'T USE ANY OF THE SPECIAL HIGH-WORD OR IMMEDIATE INSTRUCTIONS BECAUSE THEY ARE A FACILITY, SO I KEEP THE CODE AT A FAIRLY OLD FACILITY LEVEL.

* CONSTANT IN PROGRAM
                 DS 0D
INCCHANGES       DC F'1',F'0'

THIS IS A SAMPLE ELEMENT LAYOUT. ESSENTIALLY, I ALWAYS KEEP THE PREV AND NEXT POINTERS AS THE FIRST DOUBLEWORDS IN AN ELEMENT. THE REMAINING AREAS OF THE ELEMENT HAVE NO BEARING ON THIS PROCESS.

ELEMENT          ...
ELEMENT_PTRS     DS 0XL16
ELEMENT_PREV     DS D
ELEMENT_NEXT     DS D
ELEMENT_DATA     DS ...
                 ...
ELEMENT_SIZE     EQU *-ELEMENT_PREV

ALL THE CODE IS REENTRANT AND THE FOLLOWING WORK AREAS ARE REFERENCED:

* IN WORKING STORAGE
PLOWORK          DS XL144   NEEDED FOR COMPARE AND SWAP AND STORES
SEARCH_PTR       DS D       INITIALIZED TO 0 FOR START OF SEARCH
*                           SET TO 0 FOR ADD OR ENTRY TO BE DELETED
SEARCH_COUNTER   DS D       SET BY CALLER TO CURRENT COUNTER VALUE

SO HERE IS THE SAMPLE EXTRACTED FROM A WORKING ALGORITHM. I CAN'T PUT THE ORIGINAL CODE IN THIS EXAMPLE FOR MANY REASONS. BUT I CAREFULLY EXTRACTED FROM A WORKING ALGORITHM AND MADE IT INTO THIS EXAMPLE. THIS EXAMPLE SHOWS COMPARE AND LOAD, COMPARE DOUBLE AND SWAP AND SWAP AND DOUBLE STORE. IF YOU'RE NOT FAMILIAR WITH PLO TYPE SERIALIZATIONS, IT MAY TAKE A WHILE TO DIGEST ALL OF THIS. I DON'T USE THE PLO MNEMONICS. I'VE BEEN CODING IT A VERY LONG TIME AND I JUST CODE IT.

* START OF A SERVICE CALL FROM A DU, SEE IF IT'S ALREADY BEEN ADDED
* KEY IS ASCB ADDRESS AND DU (TCB). THE ADD ONLY RUNS IN THE DU BEING
* ADDED
* PRIME THE SEARCH ARGUMENTS. SEARCH_COUNTER HAS THE QUEUE_COUNTERS AT
* START OF THE SEARCH AND THEY ARE NOT CHANGED UNLESS THE PROCESS HAS
* TO BE REDRIVEN
SEARCH_REDRIVE DS 0H
         XC    SEARCH_PTR,SEARCH_PTR          PRIME TO START AT HEAD
         MVC   SEARCH_COUNTER,QUEUE_COUNTERS  GET STARTING COUNTER
*
* SEARCH QUEUE FOR REQUIRED ELEMENT
SEARCH_LOOP DS 0H
         LA    R1,QUEUE_COUNTERS              LOCK WORD
         LG    R14,SEARCH_COUNTER             LOAD CURRENT COUNTER
***
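The SEARCH_REDRIVE / SEARCH_LOOP shape is an optimistic read loop: snapshot the counter, walk the chain, and re-drive if the counter moved. A single-threaded C analogue with invented names (real code would need atomics and memory ordering, which this sketch omits):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct node { struct node *next; int key, val; };

static struct node *qhead;
static uint32_t     qchanges;   /* the "change count" half of the counter */

/* Walk the chain under a snapshot of the change counter; if the counter
   moved, an add/delete intervened and the walk is re-driven, just as a
   failed PLO Compare and Load re-drives the search. */
static int search(int key, int *out)
{
    for (;;) {
        uint32_t snap = qchanges;
        int found = 0, val = 0;
        for (struct node *n = qhead; n != NULL; n = n->next)
            if (n->key == key) { found = 1; val = n->val; break; }
        if (qchanges == snap) {      /* counter unchanged: consistent walk */
            if (found) *out = val;
            return found;
        }
        /* counter moved: re-drive the search */
    }
}
```

Readers never write the counter, which is why this pattern leaves read-mostly traffic contention-free.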
Re: Serialization without Enque
Yes, I was talking about all references using PLO. I was also assuming this was a "work" queue where "deletion" from the chain was the methodology for claiming ownership of the element. However, any serialization method that performs deletion must also have a method for claiming ownership before an element can be deleted. It doesn't matter whether deletion is an actual release of storage or the placement of the element into a free chain. If any process other than the owning process maintains a reference to an element without claiming it, the problem exists whether you use locks, PLO, CS, whatever. If the reference is nothing more than searching the chain, then PLO Compare and Load can solve that. If the reference is more than that, the problem is not storage overlays or 0c4s. The problem is a missing method for ownership. If this is true for this application, the chain is only serializable with a lock and the lock must be held throughout the period where the element is referenced before the element can be safely deleted. -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jon Perryman Sent: Friday, November 08, 2013 1:11 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque The storage overlay does not pertain to the PLO. It pertains to the entire element not being immediately removed from any type of use. Just because you removed the element from the chain does not mean it's not in use somewhere. You can't even say how long the element may be in use (e.g. task does not get any CPU because of CPU load or swapped out address space in multi-address space serialization). Jon Perryman. >________ > From: Kenneth Wilkerson > > > >A storage overlay cannot occur in a properly implemented PLO with a >counter as long as the counter is properly maintained with every >process incrementing it by 1. Even in in a free chain implementation, >an improper PLO sequence can result in a circular or broken chain. 
> >-Original Message- >Behalf Of Jon Perryman > >This specific paragraph from Peter is about "FREE QUEUE PROTOCOL". > --
Re: Serialization without Enque
A storage overlay cannot occur in a properly implemented PLO with a counter as long as the counter is properly maintained with every process incrementing it by 1. Even in a free chain implementation, an improper PLO sequence can result in a circular or broken chain. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jon Perryman Sent: Friday, November 08, 2013 11:44 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque This specific paragraph from Peter is about "FREE QUEUE PROTOCOL". This is where elements on your chain are no longer needed. Peter recommends not freeing the element. Instead, you should use a queue of free elements that you reuse when you need a new element. Peter's concern is not the chaining. It's with the data each element represents and making sure it remains consistent in a multi-processing environment. A program using the element is not guaranteed the element hasn't been freed and possibly re-allocated. PLO only ensures serialization of the chain. To ensure the element is valid, header validation is not enough. You should maintain a validation field that is occasionally verified. This greatly reduces the exposure but does not completely eliminate it. Peter's concern is valid and justified. S0C4 abends are visible so they can be dealt with. The real problem is storage overlays that are not immediately apparent. Even worse is when they affect unrelated programs. Jon Perryman. > > From: Donald Likens > > >Thank you for your help (all of you) but Peter's statement below does not make sense to me (maybe because I don't understand something). > >The reason that the free queue protocol needs a sequence number is >because even if the header "matches", the values that you need to put >into your new element for the "next" and/or "previous" may no longer be >correct due to a set of additional asynchronous operations. > >I use PLO to add an element to a chain. 
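The claim above (a properly maintained counter prevents the overlay case) is the classic ABA problem, and it can be demonstrated with a toy comparison. Names are invented; the second function mimics a CDS/PLO on the head pointer paired with a change counter:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static void    *g_head;     /* chain head */
static uint32_t g_changes;  /* bumped by 1 on every add/delete */

/* Compare only the head: cannot see that the chain was popped and the
   same storage re-chained in between (the ABA bug). */
static int cas_head_only(void *expect, void *repl)
{
    if (g_head != expect) return 0;
    g_head = repl;
    return 1;
}

/* Compare head AND counter together, as a CDS or PLO with a counter
   would: any intervening add/delete makes the compare fail. */
static int cds_head_and_counter(void *ehead, uint32_t ecount, void *repl)
{
    if (g_head != ehead || g_changes != ecount) return 0;
    g_head = repl;
    g_changes++;
    return 1;
}
```

The bare-pointer compare "succeeds" on a stale snapshot, which is exactly the broken-chain/overlay scenario; the paired counter rejects it.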
The chain only has forward pointers. I always add to the end of the chain. I use storage (not CPOOL) to get the element. It turns out that I haven't gotten the PLO instruction to work properly yet but in theory in my scenario it seem to me that if the pointer to the last element is pointing to a element (Not 0) I should be able to store the next pointer and the last pointer in one PLO CSDST. Here is the actual (not working code... It is not updating the chain properly): > >CSAMSEGL <== Last element on the chain >CSAMSEGF <== First element on the chain >R8 Address of element to add. >MSEGNEXT <== Pointer to next element in last control block MSEGCB <== >element name > >DOX078 DS 0H *C IF CSAMSEGL EQ 0 THEN (no elements on the >chain) > XR R4,R4 > XR R5,R5 > LR R2,R8 (R8: ADDRESS OF MSEGCB) > LR R3,R8 *C SET CSAMSEGL = CSAMSEGF = MSEGCB JUST >BUILT > CDS R4,R2,CSAMSEGF IF CSAMSEGF & CSAMSEGL = 0, STM >2,3,CSAMSEGF > BC 4,ELIFX076 *C ELSE > B EIFX076 >ELIFX076 DS 0H *C IF CSAMSEGL = CSAMSEGL (R2) *C SET CSAMSEGL = >POINTER_TO_NEW_MSEG (R8) *C SET MSEGNEXT = POINTER_TO_NEW_MSEG (R8) >CSDST EQU 16 > L R2,CSAMSEGL > LA R0,CSDST > LA R1,PLT > LR R3,R8 > LA R4,CSAMSEGL CSAMSEGL IS IN CSA > ST R4,OPERAND6 > LA R4,MSEGNEXT MSEGNEXT IS IN CSA > ST R4,OPERAND4 > ST R8,OPERAND3 > ST R8,OPERAND5 > PLO R2,CSAMSEGL,0,PL >* THE FIRST-OPERAND COMPARISON VALUE AND THE SECOND OPERAND ARE >* COMPARED. IF THEY ARE EQUAL, THE FIRST-OPERAND REPLACEMENT VALUE >* IS STORED AT THE SECOND-OPERAND LOCATION, AND THE THIRD OPERAND IS >* STORED AT THE FOURTHOPERAND LOCATION. THEN, THE FIFTH OPERAND IS >* STORED AT THE SIXTH-OPERAND LOCATION. 
> BNZ DOX078
>EIFX076  DS 0H
>
>PLT      DS D     PLO LOCK TOKEN
>PL       DS 0F    PARAMETER LIST
>         ORG PL+60
>OPERAND3 DS A     NEW MSEG ADDRESS
>         ORG PL+76
>OPERAND4 DS A     ADDRESS OF CSAMSEGL
>         ORG PL+92
>OPERAND5 DS A     NEW MSEG ADDRESS
>         ORG PL+108
>OPERAND6 DS A     ADDRESS OF MSEGNEXT
Re: Serialization without Enque
First, I'm not sure why you have chosen PLT as your lock word. It's very important that the lock word resolves to the same REAL address no matter where the PLO executes. Since you are talking about multiple operations against the same chain, unless all the processes exist in the same shared program you can't be sure the same lock word is always used. The lock word contents are not altered, just used to create a lock value (PLT) for serialization. From your code, I would choose CSAMSEGF. Second, you can't mix the CDS with PLO and expect consistent results. It would be very easy to convert the CDS to a PLO Compare Double and Swap (compare each fullword and swap). Just be sure to use the same lock word, which I still suggest as CSAMSEGF. Third, you're going to have to use a count. And since you are acquiring and releasing this storage, you REALLY have to use a counter. Just add a fullword counter in CSA. In this case, I would choose the counter as the lock word. As long as you have a fullword counter and recovery to treat a 0c4 as a re-drive (see my prior comments), this should work. I don't know if you can, but the choice of another methodology where storage is not acquired for each add and released for each delete would be recommended. However, you can make this work, but not without a counter. Last, Peter's comment is very valid. Read the notes on CDS in the POM. I don't know why PLO doesn't reference them since they are just as applicable to PLO. I have not closely examined your logic so I don't know what the logic problem is in the code. I'm just commenting on the methodology. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Donald Likens Sent: Friday, November 08, 2013 10:22 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque Thank you for your help (all of you) but Peter's statement below does not make sense to me (maybe because I don't understand something). 
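The advice above (one lock word for every operation on the chain, a fullword counter, re-drive on mismatch) can be sketched as a tail insert in C. The `csdst_append` function is an invented, single-threaded stand-in for a PLO compare-and-swap-and-double-store, not working PLO code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mseg { struct mseg *next; };

static struct mseg *first, *last;
static uint32_t     changes;    /* the recommended counter / lock word */

/* Emulated compare-and-swap-and-double-store: compare the counter, and
   only if it is unchanged store both the old tail's next pointer and the
   new tail, bumping the counter so concurrent snapshots go stale. */
static int csdst_append(uint32_t seen, struct mseg *e)
{
    if (changes != seen) return 0;   /* counter moved: caller re-drives */
    e->next = NULL;
    if (last) last->next = e; else first = e;
    last = e;
    changes++;
    return 1;
}

static void append(struct mseg *e)
{
    while (!csdst_append(changes, e))   /* re-drive until the swap lands */
        ;
}
```

Because every add (and, in a full version, every delete) bumps the one counter, a stale snapshot can never store into the chain, which is the integrity property the counter buys.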
The reason that the free queue protocol needs a sequence number is because even if the header "matches", the values that you need to put into your new element for the "next" and/or "previous" may no longer be correct due to a set of additional asynchronous operations. I use PLO to add an element to a chain. The chain only has forward pointers. I always add to the end of the chain. I use storage (not CPOOL) to get the element. It turns out that I haven't gotten the PLO instruction to work properly yet, but in theory in my scenario it seems to me that if the pointer to the last element is pointing to an element (not 0) I should be able to store the next pointer and the last pointer in one PLO CSDST. Here is the actual code (not working... it is not updating the chain properly):

CSAMSEGL <== Last element on the chain
CSAMSEGF <== First element on the chain
R8       <== Address of element to add
MSEGNEXT <== Pointer to next element in last control block
MSEGCB   <== element name

DOX078   DS 0H
*C IF CSAMSEGL EQ 0 THEN   (no elements on the chain)
         XR  R4,R4
         XR  R5,R5
         LR  R2,R8          (R8: ADDRESS OF MSEGCB)
         LR  R3,R8
*C SET CSAMSEGL = CSAMSEGF = MSEGCB JUST BUILT
         CDS R4,R2,CSAMSEGF  IF CSAMSEGF & CSAMSEGL = 0, STM 2,3,CSAMSEGF
         BC  4,ELIFX076
*C ELSE
         B   EIFX076
ELIFX076 DS 0H
*C IF CSAMSEGL = CSAMSEGL (R2)
*C SET CSAMSEGL = POINTER_TO_NEW_MSEG (R8)
*C SET MSEGNEXT = POINTER_TO_NEW_MSEG (R8)
CSDST    EQU 16
         L   R2,CSAMSEGL
         LA  R0,CSDST
         LA  R1,PLT
         LR  R3,R8
         LA  R4,CSAMSEGL     CSAMSEGL IS IN CSA
         ST  R4,OPERAND6
         LA  R4,MSEGNEXT     MSEGNEXT IS IN CSA
         ST  R4,OPERAND4
         ST  R8,OPERAND3
         ST  R8,OPERAND5
         PLO R2,CSAMSEGL,0,PL
* THE FIRST-OPERAND COMPARISON VALUE AND THE SECOND OPERAND ARE
* COMPARED. IF THEY ARE EQUAL, THE FIRST-OPERAND REPLACEMENT VALUE
* IS STORED AT THE SECOND-OPERAND LOCATION, AND THE THIRD OPERAND IS
* STORED AT THE FOURTH-OPERAND LOCATION. THEN, THE FIFTH OPERAND IS
* STORED AT THE SIXTH-OPERAND LOCATION.
Re: Serialization without Enque
-Original Message- From: Kenneth Wilkerson [mailto:redb...@austin.rr.com] Sent: Friday, November 08, 2013 8:46 AM To: 'IBM Mainframe Discussion List' Subject: RE: Serialization without Enque >I really don't see the big deal with an 0c4 in this scenario (should happen rarely) You misunderstood my point. You could use PLO to serialize a chain even if the areas are released as they are deleted, provided you always use PLO Compare and Load to load the pointers and recovery sets a retry to re-drive the process. As long as the count is updated each time you do a delete (and release), if the delete occurs while some other management function is being performed, the PLO Compare and Load will force the re-drive either by CC or 0c4. If the storage had been reallocated but was still in the same key, the PLO would fail because the count has changed. PLO may fetch operands before the lock and memory serialization, so an 0c4 can occur on the PLO Compare and Load. Either way, the PLO does not store, so there is never an overlay. I would never design an application to use PLO in this fashion. However, if I had an existing application, I find this methodology more desirable than getting a CMS lock everywhere I access the chain. I stand by my statement that you can serialize 99+% of all scenarios using PLO and that PLO serialization is much more desirable than locks. If this were not the case, why bother creating the transactional execution facility? Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Peter Relson Sent: Friday, November 08, 2013 8:03 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque >applicable to 99%+ of all serialization scenarios you encounter To be frank, you might not have very complex serialization requirements. Also, using PLO when CS, CSG, CDS, or CDSG would do is a significant waste of cycles. 
For the cases I have seen within our code, uses of PLO (in the cases where it is not better to use something simpler) are a tiny percentage of our serialization needs.

>When the updating process wakes up S0C4!

Using PLO to update a free queue, as is the case with CPOOL and its CDS-based free-queue protocol, requires that the queue elements *never* be freed (unless you like potentially blowing up or, worse, overlaying something you didn't intend to write into). Perhaps this is not well understood.

>I really don't see the big deal with an 0c4 in this scenario (should happen rarely)

That's a scary statement. If you get an 0C4 you could probably deal with it. The real risk is the case where you don't get an 0C4 because the storage was re-allocated and used for something else, and now it does not program check but overlays something.

>I think I figured out a solution:

There are a lot of details missing in what was shown, but if you want my honest suspicion, it's that if this is a "free queue" type of protocol, it will not work. The reason that the free queue protocol needs a sequence number is because even if the header "matches", the values that you need to put into your new element for the "next" and/or "previous" may no longer be correct due to a set of additional asynchronous operations.

Peter Relson z/OS Core Technology Design

-- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
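Relson's sequence-number point is the classic ABA hazard, and it can be simulated deterministically in a few lines of Python (all names invented; this is a single-threaded simulation of the interleaving, not real concurrent code): a compare on the head pointer alone still "matches" after the head was popped, freed, and pushed back, so the stale captured next pointer gets installed; a (sequence, head) pair detects the intervening operations and forces a re-drive.

```python
def pop_plain(queue, observed_head, captured_next):
    # CAS on the head pointer alone: "matches" if head is still the same object
    if queue['head'] is observed_head:
        queue['head'] = captured_next
        return True
    return False

def pop_seq(queue, observed_seq, observed_head, captured_next):
    # CAS on the (sequence, head) pair: any intervening pop or push bumped seq
    if queue['seq'] == observed_seq and queue['head'] is observed_head:
        queue['head'] = captured_next
        queue['seq'] += 1
        return True
    return False

# Free queue A -> B -> C
A = {'name': 'A', 'next': None}
B = {'name': 'B', 'next': None}
C = {'name': 'C', 'next': None}
A['next'] = B
B['next'] = C

# "Thread 1" reads the header and the next pointer, then stalls.
seen_seq, seen_head, captured_next = 0, A, A['next']   # captured_next is B

# Meanwhile "thread 2" pops A, pops B (B is freed and reused elsewhere),
# then pushes A back.  The head is A again, but B is gone:
A['next'] = C
state_plain = {'head': A, 'seq': 3}   # same post-interleaving state,
state_seq   = {'head': A, 'seq': 3}   # copied so both pops can be tried
```

When thread 1 resumes, the plain compare succeeds and quietly installs the freed element B as the new head; the sequence-number compare fails and re-drives.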
Re: Serialization without Enque
If your application is not designed to use PLO for serialization, it'll definitely not work for you. I use PLO for serialization because of issues with locks that you are describing (system effects) and many others. All my code can run as SRBs, but unlike what you describe I almost never acquire locks. I always use PLO except when interfacing with MVS services that require locks, which I do rarely. Besides, you can't mix CS with PLO, meaning you would have to convert every CS affecting that chain. Sounds to me like you're stuck with the CMS lock. And if it's a process done frequently, it will have a system impact; a significantly greater impact than a PLO-serialized one.

The thing I don't understand is your statement: "My problem is that a process comes in and removes the control block chain while another process is suspended and attempting to update the chain. When the updating process wakes up S0C4!" If you mean that you're releasing the storage for the chain, the process doing the release could use a PLO to set the chain pointers to 0 and in that process update the swap word. The second process will then get a CC forcing re-drive, and it'll discover the chain is now 0. In that case, that would probably mean that you would need to use PLO Compare and Loads for every reference to those chain pointers, or (my preference) you would have a retry point detect the 0C4 and realize the chain has been released and just continue. I really don't see the big deal with an 0C4 in this scenario (should happen rarely). The PLO process does not update anything until the PLO executes, so no harm, no foul.

Again, if the application is not designed to use PLO, it won't work. And you're not slow. When I first started using PLO almost 10 years ago, I spent days writing test cases with special traces so I could see what was happening. The whole time I had the POM open to the PLO instruction, reading it over and over. Now, I consistently code PLO without much thought.
If you choose to spend time learning the PLO, I think you'll find it vastly superior to locks (no restrictions) and applicable to 99%+ of all serialization scenarios you encounter, provided you design the application to use PLO from the start.

Kenneth

-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Donald Likens Sent: Thursday, November 07, 2013 8:33 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque

It has taken me this long to mostly understand PLO... I must be slow. Now that I understand it (mostly) I am pretty sure it will not work for me. My problem is that a process comes in and removes the control block chain while another process is suspended and attempting to update the chain. When the updating process wakes up S0C4! That is why I was looking at using locks. If the process updating the chain holds a lock and the process removing the chain needs that lock to update the pointers, this would not happen.

So back to my original question: My code must be able to run in SRB mode and with locks held. I have a situation where I need to serialize processing and cannot use CDS because the two addresses being updated cannot be next to each other (because I use CDS with these two addresses with other addresses). I have attempted to use a combination of CS instructions to resolve this problem but it does not work. I know this will work if I use a CMS lock but I am concerned about affecting the whole system. Any advice?
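Donald's constraint (two words that cannot sit side by side for CDS) is exactly the case the counter-based PLO protocol discussed later in this thread addresses. Here is a hedged Python model of the idea (invented names; the lock emulates the interlocked counter update, and the odd/even counter plays the role of the lock-word change count): writers bump a change counter around the two stores, and readers reload until they see a stable, even counter, the moral equivalent of a PLO Compare and Load re-drive.

```python
import threading

class DisjointPair:
    """Two words too far apart for CDS, kept consistent via a change counter."""
    def __init__(self):
        self._cas = threading.Lock()   # emulates the interlocked counter update
        self.count = 0                 # even: stable; odd: update in progress
        self.word1 = 0
        self.word2 = 0

    def update(self, w1, w2):
        with self._cas:                # one updater at a time, like the lock word
            self.count += 1            # odd: readers must re-drive
            self.word1 = w1
            self.word2 = w2
            self.count += 1            # even again: the pair is consistent

    def snapshot(self):
        while True:                    # the "Compare and Load" re-drive loop
            before = self.count
            if before % 2 == 0:
                w1, w2 = self.word1, self.word2
                if self.count == before:
                    return w1, w2

pair = DisjointPair()
pair.update(7, 14)
```

A reader never takes a lock: it just re-drives until the counter proves the two loads were not split across an update.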
Re: Serialization without Enque
Thank you for mentioning the issue with CS/CDS. I have always understood that if you use PLO anywhere to serialize access to an area, you must use it everywhere to serialize access to that area. It's nice to know that the transactional facility serializes it against CS as well. I wish I had access to a processor to play with it.

I always understood that the reason PLO did a memory serialization at the end of a PLO was to ensure that any processors that were referencing the swap word(s) in a compare and swap would get consistent results: either the swap value (32, 64, or 128 bits) before the swap or after the swap. Since the swap value is always stored last, and provided you always loaded the swap value first, you would either get the correct swap and store values and the subsequent PLO would succeed, or you would get an outdated swap value with inconsistent store values and the subsequent PLO would return a CC to re-drive. I have written software traces for PLO and my observations have supported this understanding of the POM's description of PLO and memory serialization. Is this your understanding?

I often use 128-bit operations with consecutive words and double words to perform some sophisticated operations. I've understood about double word consistency. I've always assumed that PLO required quad-word alignment for 128-bit operations for the same reason. I always load the swap value (in any flavor of compare and swap and store) first. I always use a LMG (even if it's consecutive words) and I always ensure that the primary counter is in the first double word. I've read the POM many times looking for any references to quad-word consistency. It doesn't really matter because of the way I design PLO compare and swaps, but I was wondering if there was quad-word consistency as well?
-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Peter Relson Sent: Wednesday, November 06, 2013 6:37 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque

One of the shortcomings of PLO (unlike TBEGIN(C)) is that PLO in general serializes only against other uses of PLO. It does not serialize against CS on the same storage, for example.

However, cache considerations and doubleword consistency still come into play. A LM of 2 words of a doubleword is done with what is referred to as "doubleword consistency". That matters. If you need to load two consecutive words and you can arrange that those two words are in the same doubleword, it can be to your advantage. It's why in a doubleword serialized by CDS you do not typically (if ever) want to load the two individual words with separate instructions (as you might get results that come half from one CDS and half from another CDS); you want to use LM.

Peter Relson z/OS Core Technology Design
Re: Serialization without Enque
This is a complicated question that is very dependent on the design of your application. So if the design needs to use a PLO Compare and Load, by all means do so. I try to design these applications to avoid as many PLO Compare and Loads as possible. The memory serializations (once at the start and once at the end) are expensive, but much less so than software locks. This means that I design the update operations so that loading the pointers during a PLO operation will, at worst, simply result in outdated information. As I explained in my first example, this is possible regardless of whether you use PLO or a lock. It's just a consequence of concurrent activity and the order in which these activities occur.

In reality, I don't normally use PLO Compare and Swap and Store (any flavor) for chains. They work fine for singly linked lists. Usually I use it (just wrote something today) to dynamically add an entry to a pre-allocated slot in a cell. In this case, the cell has pre-allocated slots with a high water mark pointer and a count, consecutive words in a double word. The high water mark and count are the 2nd operand and the lock word. Since the 2nd operand always updates last, any references would not even know of the new addition until the PLO completes.

I fall back to my original provision: the use of PLO is heavily dependent on designing the application to use PLO. Just yesterday, I debugged a problem in a task that was not doing a PLO Compare and Load to get a count. Certainly, all references to the counts in a Compare and Swap and Store should be updated with a PLO and probably require a PLO Compare and Load.

-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jon Perryman Sent: Tuesday, November 05, 2013 10:50 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque

Sorry for the confusion but that's not the question that I was asking.
I agree with you on guaranteeing the consistency using the count. I'm talking about TCB1 using PLO CSDST to store 2 adjacent words (4th & 6th PLO operands) and TCB2 using LM or LG for those same 2 words. There is a very small window between the 2 stores where TCB2 will pick up inconsistent values. In other words, the first store has completed and the LM/LG occurs before the second store completes. This window is extremely small because PLO cannot be interrupted and the instruction was prepared before performing the stores. I think the window is so small that even under heavy usage, you would only see an error every couple of months, but it does exist. I think TCB2 must also use the PLO compare and load to avoid this situation.

Thanks for the great information, Jon Perryman.

From: Kenneth Wilkerson
>
>The order of stores is unpredictable except that according to the POM,
>operand 2 (in this case, the count) is always stored last.
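Kenneth's pre-allocated-slot scheme, with the high-water mark and count as the second operand that is "always stored last", can be modeled in Python (invented names; the lock stands in for the PLO interlock). Because the slot is filled before the count is swapped, a reader that loads the count first can safely read that many slots; for this particular structure, that is one answer to Jon's torn-read concern.

```python
import threading

class Cell:
    """Pre-allocated slots guarded by a count that is always stored last."""
    def __init__(self, size):
        self._plo = threading.Lock()   # stands in for the PLO interlock
        self.slots = [None] * size
        self.count = 0                 # the 2nd operand / lock word

    def add(self, value):
        while True:                    # re-drive loop
            seen = self.count
            with self._plo:            # compare, store slot, swap count (last)
                if self.count == seen:
                    self.slots[seen] = value   # slot store first...
                    self.count = seen + 1      # ...count swapped last
                    return
            # count moved underneath us: re-drive

    def used(self):
        n = self.count        # load the count first...
        return self.slots[:n] # ...so these slots are guaranteed complete
```

A reader that instead loaded individual slots before checking the count would be back in the torn-read window; loading the count first is what the store ordering buys you.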
Re: Serialization without Enque
The order of stores is unpredictable except that, according to the POM, operand 2 (in this case, the count) is always stored last:

"In those cases when a store is performed to the second-operand location and one or more of the fourth-, sixth-, and eighth-operand locations, the store to the second-operand location is always performed last, as observed by other CPUs and by channel programs." Page 7-290, right column, top half of page in SA22-7832-09; 7-281 in SA22-7832-08.

So it's impossible for the count to be updated before the stores. I've been using and relying on these techniques for years, with exhaustive testing under high workloads with re-drive statistics to help me decide the algorithm that I use. It can't hurt to do the PLO Compare and Load. It just adds overhead that is probably more efficiently handled by a re-drive. But it's up to you. I suggest you add a re-drive counter to your test case and see for yourself. I would be extremely surprised if the re-drive percentage were ever higher than a small fraction of 1%, no matter how hard you drove the chain.

Kenneth

-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jon Perryman Sent: Monday, November 04, 2013 3:31 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque

As you say, PLO only locks CPUs using the same PLO lock word. For other CPUs not using the lock word, it is considered multiple unique instructions. So in the case of the 64 bit address, PLO CSDST, it is considered compare, store value1, store value2, store swap value. Although it's unlikely, it is possible for the LG instruction to occur after store value1 but before store value2. Or are the stores considered a single-occurrence instruction by the other CPUs?

Thanks, Jon Perryman.

>____
> From: Kenneth Wilkerson
>To: IBM-MAIN@LISTSERV.UA.EDU
>Sent: Monday, November 4, 2013 1:06 PM
>Subject: Re: Serialization without Enque
>
>
>This is not correct.
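Kenneth's measurement suggestion, a CS-updated re-drive counter compared against the operation count, sketches out like this in Python (threads stand in for CPUs, the lock emulates the interlocked compare-and-swap, and all names are invented):

```python
import threading

class Counter:
    def __init__(self):
        self._cas = threading.Lock()   # emulates the interlocked CS
        self.value = 0
        self.redrives = 0

    def compare_and_swap(self, old, new):
        with self._cas:
            if self.value == old:
                self.value = new
                return True
            return False

    def add_one(self):
        while True:                    # serialized update with re-drive
            seen = self.value
            if self.compare_and_swap(seen, seen + 1):
                return
            with self._cas:            # the CS-updated re-drive counter
                self.redrives += 1

OPS, THREADS = 5000, 4
counter = Counter()

def worker():
    for _ in range(OPS):
        counter.add_one()

workers = [threading.Thread(target=worker) for _ in range(THREADS)]
for t in workers:
    t.start()
for t in workers:
    t.join()
print(f"{counter.redrives} re-drives in {counter.value} operations "
      f"({100.0 * counter.redrives / counter.value:.3f}%)")
```

The final value is exact regardless of contention; only the re-drive count varies from run to run, which is precisely the statistic Kenneth suggests collecting.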
The choice to PLO compare and load is not required >since the count is always guaranteed to be swapped after the stores (my >last email). I only use PLO Compare and load for complex chain >manipulations. But do it if you want. The serialization performed by a >PLO forces serialization on the lock word for all processors. I try to >Avoid it for situations where a re-drive is less costly > >-Original Message- >From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] >On Behalf Of Jon Perryman >Sent: Monday, November 04, 2013 2:42 PM >To: IBM-MAIN@LISTSERV.UA.EDU >Subject: Re: Serialization without Enque > >Thanks for pointing out that it's required to do the PLO COMPARE >against the counter and FETCH of the value otherwise there is no >guarantee that value1 is consistent with the counter. > >I'm also hearing you say that programs that reference more than a >single word, must use PLO COMPARE and FETCH. In Kenneth's example where >he uses PLO to save 64 bit addresses (which is 2 words), he can't use >LG to reference the 64 bit address otherwise he risks using high and >low register values that do not match. Is that correct? > >Jon Perryman. > > > >>____ >> From: Binyamin Dissen >> >> >>That won't help if you fetch the new count and the old value1. >> >>On Mon, 4 Nov 2013 11:38:38 -0600 Kenneth Wilkerson >> >>wrote: >> >>:>Yes, it is possible that the updates are not performed in any order. >>:>However, it is guaranteed that the updates are only performed if the >>swap :>can be done. Therefore, I use a simple rule. If the number of >>instructions :>needed to compute the new chain pointers are small (as >>is the case in my :>example). I don't incur the overhead of doing the >>extra 2 PLO (Compare and >>:>Load) operations. I simply re-drive the operation as shown in >>Binyamin's :>example. Even with the PLO Compare and Load, there is no >>guarantee the swap :>will succeed. It just lessens the likelihood. 
So >>the decision point is :>whether the overhead of 2 additional PLO >>instructions is less than the :>overhead of a re-drive. This can only >>be determined with testing. You can :>determine this by using a CS to >>update a counter for every re-drive. You :>already have an operation >>count, so you can then easily determine the :>percentage of re-drives. >>In my experience, even in very active chains, the :>PLO serialization >>process will incur a very small number of re-drives (much :>less than >>1 >percent). But only testing can reveal that.
Re: Serialization without Enque
This is not correct. The choice to PLO Compare and Load is not required, since the count is always guaranteed to be swapped after the stores (my last email). I only use PLO Compare and Load for complex chain manipulations. But do it if you want. The serialization performed by a PLO forces serialization on the lock word for all processors. I try to avoid it for situations where a re-drive is less costly.

-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jon Perryman Sent: Monday, November 04, 2013 2:42 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque

Thanks for pointing out that it's required to do the PLO COMPARE against the counter and FETCH of the value, otherwise there is no guarantee that value1 is consistent with the counter.

I'm also hearing you say that programs that reference more than a single word must use PLO COMPARE and FETCH. In Kenneth's example where he uses PLO to save 64 bit addresses (which is 2 words), he can't use LG to reference the 64 bit address, otherwise he risks using high and low register values that do not match. Is that correct?

Jon Perryman.

>____
> From: Binyamin Dissen
>
>That won't help if you fetch the new count and the old value1.
>
>On Mon, 4 Nov 2013 11:38:38 -0600 Kenneth Wilkerson wrote:
>
>:>Yes, it is possible that the updates are not performed in any order. :>However, it is guaranteed that the updates are only performed if the swap :>can be done. Therefore, I use a simple rule. If the number of instructions :>needed to compute the new chain pointers is small (as is the case in my :>example), I don't incur the overhead of doing the extra 2 PLO (Compare and :>Load) operations. I simply re-drive the operation as shown in Binyamin's :>example. Even with the PLO Compare and Load, there is no guarantee the swap :>will succeed. It just lessens the likelihood.
So >the decision point is :>whether the overhead of 2 additional PLO >instructions is less than the :>overhead of a re-drive. This can only >be determined with testing. You can :>determine this by using a CS to >update a counter for every re-drive. You :>already have an operation >count, so you can then easily determine the :>percentage of re-drives. >In my experience, even in very active chains, the :>PLO serialization >process will incur a very small number of re-drives (much :>less than 1 percent). But only testing can reveal that. >:> >:>-Original Message- >:>From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] >On :>Behalf Of Binyamin Dissen >:>Sent: Monday, November 04, 2013 11:15 AM >:>To: IBM-MAIN@LISTSERV.UA.EDU >:>Subject: Re: Serialization without Enque :> :>My understanding is >with multi-threading it is possible that the updates to :>the fields >may be out of order and thus it is possible to fetch the updated >:>counter with the unupdated value1. PLO serializes it. >:> >:>On Mon, 4 Nov 2013 07:46:51 -0800 Jon Perryman wrote: >:> >:>:>Thanks Binyamin. Also a great example but it brings me to another >:>question. What is the advantage of using PLO compare and fetch? Is it >just :>saving CPU time in the case where the counter has changed? Is >there another :>advantage that I'm not thinking about? >:>:> >:>:>Jon Perryman. >:>:> >:>:> >:>:> >:>:>> >:>:>> From: Binyamin Dissen :>> :>> :>> >:>>If you :>truly need a triple compare and swap then PLO will not help >you. But if :>:>>you need a disjoint double compare and swap, you use >the compare-and-swap :>:>>field as a counter and then you con do a compare swap and double store. 
>:>:>>Example:
>:>:>>
>:>:>>  Fetch counter
>:>:>>A PLO compare-and-fetch value1
>:>:>>  CC>0, go to A
>:>:>>  PLO compare-and-fetch value2
>:>:>>  CC>0, go to A
>:>:>>  calculate new value1 and value2
>:>:>>  Add one to fetched counter
>:>:>>  PLO CSDST fetched-counter new-fetched-counter, new-value1, new-value2
>:>:>>  CC>0, go to A
>
>--
>Binyamin Dissen
>http://www.dissensoftware.com
>
>Director, Dissen
Re: Serialization without Enque
I'm glad you brought that up, because I knew what I have been doing for years was correct but I hadn't taken the time to read the manual on PLO in some time. The order of stores is unpredictable except that, according to the POM, operand 2 (in this case, the count) is always stored last:

"In those cases when a store is performed to the second-operand location and one or more of the fourth-, sixth-, and eighth-operand locations, the store to the second-operand location is always performed last, as observed by other CPUs and by channel programs." Page 7-290, right column, top half of page in SA22-7832-09; 7-281 in SA22-7832-08.

Kenneth

-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Binyamin Dissen Sent: Monday, November 04, 2013 2:02 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque

That won't help if you fetch the new count and the old value1.

On Mon, 4 Nov 2013 11:38:38 -0600 Kenneth Wilkerson wrote:

:>Yes, it is possible that the updates are not performed in any order. :>However, it is guaranteed that the updates are only performed if the swap :>can be done. Therefore, I use a simple rule. If the number of instructions :>needed to compute the new chain pointers is small (as is the case in my :>example), I don't incur the overhead of doing the extra 2 PLO (Compare and :>Load) operations. I simply re-drive the operation as shown in Binyamin's :>example. Even with the PLO Compare and Load, there is no guarantee the swap :>will succeed. It just lessens the likelihood. So the decision point is :>whether the overhead of 2 additional PLO instructions is less than the :>overhead of a re-drive. This can only be determined with testing. You can :>determine this by using a CS to update a counter for every re-drive. You :>already have an operation count, so you can then easily determine the :>percentage of re-drives.
In my experience, even in very active chains, the :>PLO serialization process will incur a very small number of re-drives (much :>less than 1 percent). But only testing can reveal that. :> :>-Original Message- :>From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On :>Behalf Of Binyamin Dissen :>Sent: Monday, November 04, 2013 11:15 AM :>To: IBM-MAIN@LISTSERV.UA.EDU :>Subject: Re: Serialization without Enque :> :>My understanding is with multi-threading it is possible that the updates to :>the fields may be out of order and thus it is possible to fetch the updated :>counter with the unupdated value1. PLO serializes it. :> :>On Mon, 4 Nov 2013 07:46:51 -0800 Jon Perryman wrote: :> :>:>Thanks Binyamin. Also a great example but it brings me to another :>question. What is the advantage of using PLO compare and fetch? Is it just :>saving CPU time in the case where the counter has changed? Is there another :>advantage that I'm not thinking about? :>:> :>:>Jon Perryman. :>:> :>:> :>:> :>:>> :>:>> From: Binyamin Dissen :>> :>> :>> :>>If you :>truly need a triple compare and swap then PLO will not help you. But if :>:>>you need a disjoint double compare and swap, you use the compare-and-swap :>:>>field as a counter and then you con do a compare swap and double store. 
:>:>>Example:
:>:>>
:>:>>  Fetch counter
:>:>>A PLO compare-and-fetch value1
:>:>>  CC>0, go to A
:>:>>  PLO compare-and-fetch value2
:>:>>  CC>0, go to A
:>:>>  calculate new value1 and value2
:>:>>  Add one to fetched counter
:>:>>  PLO CSDST fetched-counter new-fetched-counter, new-value1, new-value2
:>:>>  CC>0, go to A

--
Binyamin Dissen
http://www.dissensoftware.com

Director, Dissen Software, Bar & Grill - Israel

Should you use the mailblocks package and expect a response from me, you should preauthorize the dissensoftware.com domain. I very rarely bother responding to challenge/response systems, especially those from irresponsible companies.
Re: Serialization without Enque
Yes, it is possible that the updates are not performed in any order. However, it is guaranteed that the updates are only performed if the swap can be done. Therefore, I use a simple rule: if the number of instructions needed to compute the new chain pointers is small (as is the case in my example), I don't incur the overhead of doing the extra 2 PLO (Compare and Load) operations. I simply re-drive the operation as shown in Binyamin's example. Even with the PLO Compare and Load, there is no guarantee the swap will succeed. It just lessens the likelihood. So the decision point is whether the overhead of 2 additional PLO instructions is less than the overhead of a re-drive. This can only be determined with testing. You can determine this by using a CS to update a counter for every re-drive. You already have an operation count, so you can then easily determine the percentage of re-drives. In my experience, even in very active chains, the PLO serialization process will incur a very small number of re-drives (much less than 1 percent). But only testing can reveal that.

-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Binyamin Dissen Sent: Monday, November 04, 2013 11:15 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque

My understanding is with multi-threading it is possible that the updates to the fields may be out of order, and thus it is possible to fetch the updated counter with the unupdated value1. PLO serializes it.

On Mon, 4 Nov 2013 07:46:51 -0800 Jon Perryman wrote:

:>Thanks Binyamin. Also a great example, but it brings me to another question. What is the advantage of using PLO compare and fetch? Is it just saving CPU time in the case where the counter has changed? Is there another advantage that I'm not thinking about?
:>
:>Jon Perryman.

:>> From: Binyamin Dissen
:>>
:>>If you truly need a triple compare and swap then PLO will not help you.
:>>But if you need a disjoint double compare and swap, you use the compare-and-swap field as a counter and then you can do a compare swap and double store.
:>>
:>>Example:
:>>
:>>  Fetch counter
:>>A PLO compare-and-fetch value1
:>>  CC>0, go to A
:>>  PLO compare-and-fetch value2
:>>  CC>0, go to A
:>>  calculate new value1 and value2
:>>  Add one to fetched counter
:>>  PLO CSDST fetched-counter new-fetched-counter, new-value1, new-value2
:>>  CC>0, go to A

--
Binyamin Dissen
http://www.dissensoftware.com

Director, Dissen Software, Bar & Grill - Israel

Should you use the mailblocks package and expect a response from me, you should preauthorize the dissensoftware.com domain. I very rarely bother responding to challenge/response systems, especially those from irresponsible companies.
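Binyamin's recipe transcribes almost line for line into a retry loop. A Python rendering (invented names; the lock models serialization on the shared PLO lock word, and each helper models one PLO function):

```python
import threading

plo_lock = threading.Lock()            # serialization on the shared PLO lock word
state = {'counter': 0, 'value1': 0, 'value2': 0}

def compare_and_fetch(expected, field):
    # models PLO Compare and Load: returns (cc-is-zero, fetched value)
    with plo_lock:
        if state['counter'] == expected:
            return True, state[field]
        return False, None

def csdst(expected, new_counter, v1, v2):
    # models PLO Compare and Swap and Double Store on the counter
    with plo_lock:
        if state['counter'] == expected:
            state['counter'] = new_counter  # the 2nd operand, stored last on hardware
            state['value1'] = v1
            state['value2'] = v2
            return True
        return False

def disjoint_double_update(f1, f2):
    while True:                                     # "CC>0, go to A"
        seen = state['counter']                     # Fetch counter
        ok, v1 = compare_and_fetch(seen, 'value1')  # compare-and-fetch value1
        if not ok:
            continue
        ok, v2 = compare_and_fetch(seen, 'value2')  # compare-and-fetch value2
        if not ok:
            continue
        if csdst(seen, seen + 1, f1(v1), f2(v2)):   # CSDST counter + both values
            return

disjoint_double_update(lambda v: v + 10, lambda v: v + 20)
```

The two compare-and-fetch steps guarantee the computation starts from a consistent (value1, value2) pair; the final CSDST guarantees nothing changed in between.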
Re: Security exposure of zXXP was Re: zIIP simulation
Since an SRB can do a SCHEDIRB, it can do whatever it likes. SRBs were designed for authorized code to overcome restrictions. If you're authorized, the gates open.

Kenneth

-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Binyamin Dissen Sent: Monday, November 04, 2013 7:01 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Security exposure of zXXP was Re: zIIP simulation

On Sun, 3 Nov 2013 16:15:56 -0800 Jon Perryman wrote:

:>I think Itschak is saying that SRB's can't do I/O, therefore they can't write files to embed a virus or read confidential data. I think he's under the impression that SRB's can't get access to everything they desire.

SRB's certainly can do I/O - they just need to do it at the metal level.

--
Binyamin Dissen
http://www.dissensoftware.com

Director, Dissen Software, Bar & Grill - Israel

Should you use the mailblocks package and expect a response from me, you should preauthorize the dissensoftware.com domain. I very rarely bother responding to challenge/response systems, especially those from irresponsible companies.
Re: Serialization without Enque
I have used PLO almost exclusively for serialization in multi-address space, multi-du code for almost 10 years. I use all 6 operations. Since everything I write is 64 bit mode, I generally use the +2 variant (64 bit length) but I like using the +3 variant (128 bit length) for some really cool stuff. As Rob pointed out, " The key to their use is to have a lock word counter that the caller increments and then prepares the new values in other regs.". But I add that the real key is designing the application to use PLO for serialization which is much more than just writing macros to do the guts of the processes (though I almost always use macros). Consider this example. I have a singly linked chain of 64 bit cell addresses on quad word boundaries. The chain pointers are a quad word at the start of the cell. The first double word is the head of the active chain. The second double word is the head of the free chain. The first quad word of each cell is a double word pointer to the next active entry followed by a double word pointer to the next free entry. The chain has a quad word counter on a quad word boundary. The first double word of the counter is a change count. The second double word is an active element count. The algorithm always adds new cells to the head of the list. I can add a new cell by using a LMG to load the counters, increment each counter by 1, compute the new head, compute the old head's new previous and then use a Compare and swap and double store 128 bit to add the new entry. Since every update increments the first double word counter by 1, the process only completes if no other process updated the counter. If the counter has changed, it needs to re-drive. By adding entries to the head, I can also have code simultaneously searching the chain while entries are added. Of course, if the new head is added before the search starts, it won't be found. But that's no different than using a lock. 
If the search acquires the lock before the add, it won't be found either. I can even add an element that requires a search for one that has already been added. In this case, I load the counters before the search. I search the chain. If not found, I increment the counter and perform the add. If the add fails, I have to re-drive the search. I can also delete entries from the chain. When I find the entry to be deleted, I save its previous entry. I can adjust the counts, re-compute the chain pointers and do a Compare and Swap and triple store to delete the entry and add it to the free chain. I can still search the chain but I'll probably need to do a Compare and load to do so. I can avoid the PLO compare and load by not actually deleting the cell but using the low half byte of the active next pointer as a deleted flag. But that has disadvantages as well. This also adds a little more logic to the add, since I now need to add using the free chain if one exists or an add acquiring a new cell. There are a lot of details not given here for brevity. This example also uses an unordered singly linked list to simplify the example. But properly designed PLO operations can be performed on ordered doubly linked lists as well. When I read the Principles of Operation on the zEC12 transactional execution facility, I think strongly of a PLO on steroids. The point is that PLO can almost be used exclusively for serialization. As far as overhead, I have done a lot of testing and the key is the proper choice of the lock word and the algorithm. In my research, the throughput advantages of PLO far outweigh its overhead. I would love some time with the transactional execution facility. From my reading, it eliminates the need for any serialization other than PLO or transactional execution. Though I understand that IBM has chosen a redrive limit as the determining factor as to whether to fall back to a lock. I believe the only limit to using PLO for serialization is the imagination. 
Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Rob Scott Sent: Monday, November 04, 2013 2:49 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Serialization without Enque PLO CSDST and CSTST are *extremely* useful for queue and linked list manipulation in multi-ASID multi-TCB environments. The key to their use is to have a lock word counter that the caller increments and then prepares the new values in other regs. When it comes time to actually atomically update the lock word, you can redrive the structure manipulation logic if the CC indicates that the lock word value has changed, otherwise the other fields are updated atomically. For actual practical uses, it is well worth putting all this inside some sort of macro or small stub service as you do not want to have to code the guts of it each time. I also think the uptake of PLO would be greater if there were some decent example code in the manuals - for instance a client adding a request to the tail of the queue whilst a
Re: Clarification of SAC7 Abend
Since you're using ALESERV EXTRACTH, I'm assuming you want to schedule an SRB into the home address space. IEAMSCHD is expecting the address of the STOKEN. So if you were to do this:

         LA    R2,SRBSTOKEN
         IEAMSCHD EPADDR=SRBRTN@,
               PRIORITY=LOCAL,
               PARM=SRBPARM@,
               ENV=STOKEN,
               TARGETSTOKEN=(R2),      change this line
               KEYVALUE=INVOKERKEY,
               FEATURE=NONE,
               SYNCH=YES,
               LLOCK=NO,
               RMTRADDR=RTMRTN@,
               FRRADDR=FRRRTN@

I'm also assuming the parms marked with an @ are the address of the actual parm. If you want to schedule an SRB into any address space, provided you know the ASCB address, you can acquire the STOKEN as follows:

* R1 has ASCB address of target address space for SRB
         L     R1,ASCBASSB-ASCB(,R1)
         MVC   SRBSTOKEN,ASSBSTKN-ASSB(R1)

Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of esst...@juno.com Sent: Sunday, October 20, 2013 1:40 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Clarification of SAC7 Abend Hi, I need a better understanding of a SAC7 Abend when using IEAMSCHD to schedule an SRB to another program that resides in another Address Space. I issue this macro from my scheduling address space:

         IEAMSCHD EPADDR=SRBRTN@,
               PRIORITY=LOCAL,
               PARM=SRBPARM@,
               ENV=STOKEN,
               TARGETSTOKEN=SRBSTOKEN,
               KEYVALUE=INVOKERKEY,
               FEATURE=NONE,
               SYNCH=YES,
               LLOCK=NO,
               RMTRADDR=RTMRTN@,
               FRRADDR=FRRRTN@

         DS    0D       Alignment
SRBWORK  EQU   *        .SRB Routine Work Areas
SRBRTN@  DS    A        .Address Of Target SRB Routine
RTMRTN@  DS    A        .SRB Recovery Termination Routine
FRRRTN@  DS    A        .SRB Functional Recovery Routine
SRBPARM@ DS    A        .Parameters For SRB Routine
SRBSTOKEN DS   XL8      .Target/Home Address Space Token
SRBRETCODE DS  A        .Return Code From IEAMSCHED
         DS    A        .Reserved
*

I would like to have the above IEAMSCHD schedule an SRB to another Program in another Address Space (the Target Address Space). The Target Address Space previously made available the SRBWORK structure (above) to the Scheduling Address Space: The Address Of SRB Routine SRBRTN@ was Previously loaded by the Target Address Space. 
TARGETSTOKEN SRBSTOKEN obtained by ALESERV EXTRACTH,STOKEN=SRBSTOKEN in the Target Address Space. The Recovery Routines (FMTADDR and FRRADDR) were previously loaded by the Target Address Space and contain the Address Of the Recovery Routines. When I issue the IEAMSCHD macro, an Abend SAC7 occurs without any dump. I looked up ABEND=SAC7 with REASON=00080001 in messages and codes. It refers to an inappropriate Space Token. So did I incorrectly obtain the Space Token Of The Target Address Space? I used ALESERV EXTRACTH,STOKEN=SRBSTOKEN; I thought that was correct. Can anyone point me in the right direction to resolve this. Paul D'Angelo *
Re: FRR Recovery Routine Environment
>Not a big fan of EUT FRRs You're right. I prefer to examine my environment and choose to use an ESTAEX instead of an FRR unless required. The FRR stack is limited to 2 uses. So when forced to use an FRR, I extend the FRR stack by replacing the prior FRR and restoring it upon exit. Using an EUT=YES FRR creates constraints for code that doesn't need them. If you need the FRR, you have the constraints anyway. But if you don't need it and use an EUT=YES FRR, you've just imposed a whole bunch of restrictions. And I never use SVCs in any PC code. >I know that R14 points to the EXIT service call You're right again. This is an oversimplification that I didn't bother to elaborate on. >It’s a real pain when you want to retry into 64 bit addresses However, I support releases prior to 1.13 so I choose a method that works to the lowest common denominator. I retry to a 31 bit stub that loads the 64 bit address and does a BSM. Your point is well taken. I should refrain from expressing opinions in this forum. -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Peter Relson Sent: Thursday, October 03, 2013 7:14 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: FRR Recovery Routine Environment >Not a big fan of EUT FRRs. That's a curious statement. As with most things, EUT FRRs solve a problem. If you have the problem, you need them. If you don't have the problem, you don't need them. It seems that it is really not a question of being a fan or not. Maybe you're not a fan of using FRRs in environments where you could use an ESTAEX for such reasons as you want to be able to issue an SVC from your recovery routine or you do not want your recovery to get control with locks obtained by the mainline in such cases. >I know that R14 points to the EXIT service call That is not true for FRRs. 
>It’s a real pain when you want to retry into 64 bit addresses Given that you have gone to the real pain of using RMODE 64 apparently for many cases that are not supported, I'm surprised to see this statement. You can set your 64-bit address with bit 63 on into the register 15 retry slot of the SDWA (SDWAG6415), use SETRP with RETRY=64,RETRY15=YES, and identify a retry address of CVTBSM0F. I'd call that a relatively minor inconvenience; perhaps that's a real pain to you. This has been available since z/OS 1.13 which is the first release where RMODE 64 for enabled code was tolerated. Peter Relson z/OS Core Technology Design
Re: FRR Recovery Routine Environment
I use FRRs a lot because just about every PC routine I write can be called from an SRB. The PC routine defines an ESTAEX when called in task mode and examines the FRR stack and adds or replaces an FRR when called in an SRB routine or when a lock is set. I rarely use EUT=YES FRRs. Not a big fan of EUT FRRs. I know that the documentation states that R2 points to the 24 byte parm area but I've always retrieved it from the SDWAPARM field. In your dump R2 seems to point to the FRR extension. And yes, the parm address is in the PSA (BE looks right for the first entry in the stack). The FRR stack has to be specially managed by MVS and saved and restored each time a DU is dispatched. There are a lot of details missing from your email so I'm just going to give you the working linkage to a well tested FRR routine. First of all, I don't save the registers. I know that R14 points to the EXIT service call so I don't worry about the registers. Second, I always set the first 8 bytes of the 24 byte parm area to an eyecatcher. I always do that last in case something goes wrong before the FRR is fully set up. Believe me, things can go wrong. Lastly, my FRR entry point is just a stub. After dealing with the FRR dependencies, it just jumps into the ESTAEX recovery routine with an entered-from-FRR flag set. The way I have everything set up, the processing of an abend doesn't matter once the starting linkage is addressed. One other thing, most of my routines are RMODE64. Therefore, the FRR exit is loaded into 31 bit storage. So most of my code uses the 64 bit instruction set and a 31 bit SDWA. It’s a real pain when you want to retry into 64 bit addresses but it's well tested.

* FRR ENTRY TO HANDLE RELEASING LOCK.  IT TRANSFERS CONTROL TO ABEND
* RECOVERY ABOVE TO RECORD ERROR
TDFPCLNK_FRR DS 0H
         LARL  R12,TDFPCLNK_ESTAEX     USE ESTAEX ADDRESSABILITY
         USING TDFPCLNK_ESTAEX,R12
         USING SDWA,R1
         LLGT  R8,SDWAPARM             FRR PARM AREA ADDRESS
         LLGTR R7,R0                   SAVE FOR JUMP INTO ABEND RECOVERY
         LLGFR R5,R14                  SAVE FOR JUMP INTO ABEND RECOVERY
         LLGTR R6,R1                   SAVE FOR JUMP INTO ABEND RECOVERY
         CLC   0(8,R8),=CL8'eyecatch'  GOT MY EYECATCHER?
         JE    TDFRECV_GOTPARMS        YES - LETS RECORD
**
* WITHOUT PARM AREA, WE CAN STILL SET ABEND CODES FOR PERCOLATION
         LLGTR R1,R6                   RESET TO RETURN W/FAILURE
         LARL  R2,retry_address_if_no_parms
         SETRP RC=4,REMREC=YES,RETADDR=(R2)
         LLGTR R14,R5                  RESET TO RETURN W/FAILURE
         BR    R14
**
* RESET MY PARM REGISTERS
TDFRECV_GOTPARMS DS 0H
*        Load parms from R8 and jump into ESTAEX RTM exit with
*        called-from-FRR flag set

-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Donald Likens Sent: Wednesday, October 02, 2013 3:47 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: FRR Recovery Routine Environment I do not understand what I am doing wrong. I set up the following recovery environment:

         SETFRR A,FRRAD=FRRA,EUT=YES,MODE=FULLXM,WRKREGS=(R1,R2),
               PARMAD=(R3)
         ST    R13,0(R3)

I then caused my program to abend. When my recovery routine is entered I ABEND with an S0C4 because R2 is not as I expected. The following is the start of my FRR recovery routine:

*C START FRR RECOVERY ROUTINE
FRR      DS    0H
* ON ENTRY
*   R15 ADDRESS OF THIS ROUTINE
*   R14 RETURN ADDRESS
*   R1  ADDRESS OF SDWA
*   R2  ADDRESS OF PARAMETERS
         STM   R0,R15,0(R13)
         USING FRR,R15
         L     R3,0(R2)
         USING WKGSTG,R3
         LR    R12,R15
         DROP  R15
         USING FRR,R12

I thought perhaps R2 would be the address of the parameter list but the
Re: Memory For MSTJCL00 - Whose Is It?
If I were diagnosing this problem, I would take a console dump of ASID 1. If a resource manager is the culprit and it uses an eye catcher, I would expect to see a bunch of storage with that eye catcher. I was simply suggesting that an ASCB resource manager is a good possibility. Resource managers can be dynamically defined by the RESMGR service or statically defined at IPL. Chapter 18 in the MVS Programming: Authorized Assembler Services Guide has a section on resource managers including those statically defined at IPL time. I would look at the statically defined resource managers first, particularly any that are defined to execute after the termination of all address spaces. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Martin Packer Sent: Monday, September 16, 2013 8:46 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Memory For MSTJCL00 - Whose Is It? Thank you. Care to name a few - not to point the finger, but so I have some idea what they are? Cheers, Martin Martin Packer, zChampion, Principal Systems Investigator, Worldwide Banking Center of Excellence, IBM +44-7802-245-584 email: martin_pac...@uk.ibm.com Twitter / Facebook IDs: MartinPacker Blog: https://www.ibm.com/developerworks/mydeveloperworks/blogs/MartinPacker From: Kenneth Wilkerson To: IBM-MAIN@listserv.ua.edu, Date: 09/16/2013 02:38 PM Subject:Re: Memory For MSTJCL00 - Whose Is It? Sent by:IBM Mainframe Discussion List Address space resource managers execute in asid 1, *MASTER*. Unless they issue a message, you would never know they executed. If an ASCB RESMGR were not cleaning up after itself, it would account for accumulations. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Staller, Allan Sent: Monday, September 16, 2013 8:01 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Memory For MSTJCL00 - Whose Is It? General thoughts with no hard data behind them. I.E. 
SWAGs 1) MSTJCL00 (i.e. *MASTER*) has been flagged by WLM as Storage Critical. Check w/WLM development. 2) Turn on the VSM* parameters in SYS1.PARMLIB(DIAG*) for data gathering. View w/IPCS and/or RMF. Not sure what (if any) RSM* parameters are available. I believe your theory about MSTJCL00 being used as an anchor is reasonable, however, 2 GB of anchors seems excessive, even in a very large system. I do not believe this is a backing for anything that does not belong to a "system address space". FWIW, Almost a year ago in https://www.ibm.com/developerworks/community/blogs/MartinPacker/entry/bad_da ta_and_the_subjunctive_mood?lang=en I talked about seeing large numbers for memory in MSTJCL00. At the time I got no takers as to what it could be. So I'm trying again here... ... Is MSTJCL00 the anchor for something? Common Large Memory Objects perhaps? And are YOU seeing large memory numbers? Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
Re: Memory For MSTJCL00 - Whose Is It?
Address space resource managers execute in asid 1, *MASTER*. Unless they issue a message, you would never know they executed. If an ASCB RESMGR were not cleaning up after itself, it would account for accumulations. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Staller, Allan Sent: Monday, September 16, 2013 8:01 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Memory For MSTJCL00 - Whose Is It? General thoughts with no hard data behind them. I.E. SWAGs 1) MSTJCL00 (i.e. *MASTER*) has been flagged by WLM as Storage Critical. Check w/WLM development. 2) Turn on the VSM* parameters in SYS1.PARMLIB(DIAG*) for data gathering. View w/IPCS and/or RMF. Not sure what (if any) RSM* parameters are available. I believe your theory about MSTJCL00 being used as an anchor is reasonable, however, 2 GB of anchors seems excessive, even in a very large system. I do not believe this is a backing for anything that does not belong to a "system address space". FWIW, Almost a year ago in https://www.ibm.com/developerworks/community/blogs/MartinPacker/entry/bad_da ta_and_the_subjunctive_mood?lang=en I talked about seeing large numbers for memory in MSTJCL00. At the time I got no takers as to what it could be. So I'm trying again here... ... Is MSTJCL00 the anchor for something? Common Large Memory Objects perhaps? And are YOU seeing large memory numbers?
Re: "NSA foils much internet encryption"
-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of John Gilmore Sent: Thursday, September 05, 2013 2:43 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: "NSA foils much internet encryption" More Snowden documents have been reviewed by the New York Times, which this afternoon concluded that The agency has circumvented or cracked much of the encryption, or digital scrambling, that guards global commerce and banking systems, protects sensitive data like trade secrets and medical records, and automatically secures the e-mails, Web searches, Internet chats and phone calls of Americans and others around the world, the documents show. This is not very different from the standard informed conjectures about what the NSA and its counterparts elsewhere can do. It is important that the readers of airline magazines disabuse themselves of the notion that they can keep secrets from these agencies using off-the-shelf technology. John Gilmore, Ashland, MA 01721 - USA
Re: Questions about ESTAE(X)
I rarely use TERM=YES. I use RTM exits almost exclusively for error reporting and setting a retry. About the only time TERM=YES is used is in the primary driver task for a cross memory server so that its RTM exit can reset the PC-services-available flag to minimize D6 abends. But I don't even rely on that. I just code the calling interfaces to treat and report D6 abends as unexpected server terminations. I rely on the RTM to clean up address space and task level resources. The real issue is common resources that are shared system wide or between 2 or more address spaces. For this, I prefer resource managers, particularly address space resource managers. And I know it's authorized and probably only necessary for cross memory servers which by definition must be authorized. This may seem off topic but the topic is about cleaning up after a termination event (TERM=YES) such as a CANCEL. Address space resource managers are guaranteed to execute, even if an address space is forced. Consider an address space that has terminated because it has exhausted its memory. If you have anything but the simplest tasks to perform, there may not be storage to do cleanup. My experience is that in serious error conditions, the RTM exit may not be the best way to clean up common resources. The only advice I have is that if you define an address space resource manager, be sure you have a timer exit to time it out should it go into a loop. This probably will never get used, but a loop in an address space resource manager, which runs in the master address space, is a non-trivial problem. Do with this information as you wish. But if you are considering TERM=YES for anything but the simplest resource cleanup, consider a resource manager. I rarely use TERM=YES. I prefer resource managers. And I only use resource managers to clean up common resources. Most of the stuff I write uses neither. 
Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Peter Relson Sent: Wednesday, August 28, 2013 8:21 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Questions about ESTAE(X) A recovery routine cannot change SDWACLUP (or most of the fields in the SDWA) and have such a change be useful to anything. If you are intended to change it, usually SETRP will let you do so, or it's a field relevant to retry or it's one of the "communication" fields. SDWACLUP is on not only for TERM=YES but also for all other retryable abends. >if I don't issue any ESTAE(X), then >*something* gets control on "normal" ABENDs That's called RTM, regardless of the type of abend. >Am I correct that you apparently can't issue an ABEND macro >(effectively) in >a recovery routine? I would say "no" but it depends what you mean by effectively. Once termination begins (think cancel) an ESTAE(X) without TERM=YES will not get control, but an ESTAE(X) with TERM=YES will. I think that applies to nested recovery too (a nested recovery routine is a recovery routine set within the ESTAE(X) routine itself). For TERM=YES, normal rules of nested recovery, percolation, and even retry apply (a nested recovery routine can retry back to the recovery routine that created it; it cannot retry back to the mainline). >if an ESTAE(X) TERM=YES is chained after an ESTAE(X) TERM=NO, is there >any way to get the chained recovery routine to percolate a >TERM=YES-type ABEND? I must not be understanding. The "chained recovery routine" in the sentence above appears to be the TERM=YES routine. It can of course "percolate" a "TERM=YES-type ABEND" (in fact it has no choice but to percolate). But when it percolates, the TERM=NO routine will not get control, specifically because it is TERM=NO and this was a "TERM=YES-type ABEND". So overall, I really don't know what is confusing. 
The basic point is "if you have nothing to clean up if the job is going to terminate due to the error, then you usually do not need TERM=YES". For example, if you might ordinarily freemain something, but if the system will do so upon job termination (as it will, in effect, do for region subpools) then you might choose not to worry about getting control in recovery for that termination case to do the freemain. Peter Relson z/OS Core Technology Design
Re: Hints needed on abend 0D6-027
Program call (PC) is a z/Architecture feature that has a z/OS server, PCAUTH (ASID 2), to administer it. The ETCON, ETDES, ETDIS, and ETDEF macros are the primary interfaces into that server. The first PC numbers defined in the system during IPL are in the range of 0 to x. Chapter 5, Program Execution, Subchapter PC Number Translation in Principles of Operation has an excellent description of how PC numbers are defined. And you are right, PCAUTH maintains these system level control blocks (actually hardware defined) in the PCAUTH 31 bit LSQA (they have to be in fixed real storage). The hardware provides the ability to disable any lookup in these tables by setting the high order bit to 1 in any of these tables. When an address space terminates, or it issues an ETDIS or ETDES for the LX, the PCAUTH server is called and it disables any references to the LX. This results in an LSX-translation exception, X'0027', which is translated to a 0D6-027. This is a very simplified explanation of what is in the POM. So why are you getting a 0D6-027? Because the PC-defining address space has terminated or disconnected the LX. When the PC-defining address space is terminating, or just before issuing its LX disconnect, it could indicate so by setting some bit in a commonly addressable control block. This bit could then be turned off when it reinitializes and reconnects the LX. This greatly reduces the likelihood of the 0D6-027 but it is impossible to eliminate it entirely (though the probability is astronomically small). No matter how close to the call that you place the test, the caller could get suspended and the address space could terminate, still resulting in a 0D6-027. So the code should also recognize a 0D6-027 as an indication that the server has terminated. 
-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Robin Atwood Sent: Friday, August 23, 2013 6:47 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Hints needed on abend 0D6-027 Our application very occasionally (once every few months) abends 0D6-027, which means a PC instruction has caused a "Linkage second index translation exception". I am wondering exactly what this is telling me since the auxiliary ASID being PCed to had been active for some time and had processed several requests, ie, the PC had been working perfectly well up until the abend. So what has gone bad? My understanding is that the PC linkage information is kept in system control blocks so that should be OK. The PC number we are calling is X'00017F00' which implies an ET index of X'F00'; is that sane? Any hints gratefully received! Robin
Re: ECTG usage
The management of the CPU timer is completely in the realm of the dispatcher/scheduler. Therefore, using ECTG when you're not in a disabled state during the entire timing process will not produce the results you want. I have always used TIMEUSED to get CPU time. It's been many years since I've had a need for TIMEUSED and it has certainly changed. It appears ECTG was written to improve its performance. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Richard Verville Sent: Thursday, July 25, 2013 8:47 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: ECTG usage I'm trying to benchmark CPU time (under CICS) with pieces of code I'm changing, ECTG before and ECTG after. I'm zeroing out operand 1 before the ECTG, thus I get a negative value in GPR0 (because ECTG subtracts operand 1 from the timer value); afterwards I'm doing a LCR of GPR0 to get the positive timer value. If the cputimer went negative during the test (timer interrupt), the second ECTG is higher than the 1st one and since I don't know the "refeed" value of the CPU timer, I can't tell how much CPU time was spent. I know I could use CICS internal values (or statistics) but since they made ECTG non-privileged I figured I'd give it a try. So... I'm missing something in the concept (the refeed value and how many times the interrupt occurred?) Richard
Re: Is there a "reverse bits" hardware instruction?
The macro at the end of my reply will generate a reverse translate table. I tested it enough to see that it looked right but it has nothing to do with my original point. I've found this discussion interesting and it has given me reason to play with something other than the complex process I'm working on now. My original point, as was taken by Charles, was that I prefer automatic processes to generate anything but the simplest translate tables. I prefer to do it in a program and copy it from a dump into a program because the table is static. I don't need to generate it every time I assemble. Regarding TROO, the original problem was stated as translating a 64 bit register. A STG, TR for 8 bytes, followed by a LRVG will more than suffice. Though I find your argument about the use of a TRTRE to simulate a FROGR interesting. It brings the whole bit reversal into perspective. I'm not a big fan of translate instructions (I use them often enough), particularly those that require facilities and facility enhancements, unless you're translating "long" strings and are willing to test the facility bits. I'm sure that the translation and parsing facilities exist on most customers' boxes by now but I emphasize "most". When I awakened this morning, I wrote an algorithm to do a load reverse bytes and bits using FLOGR to drive the process. I'm going to give the idea of a FROGR simulation more thought and continue this exercise later. 
         MACRO
&LABEL   REVTABLE ,            Construct reverse bits translate table
         LCLA  &I,&J,&K,&L,&M,&N,&O
         LCLC  &X
&LABEL   DS    0D              LIKE 'EM DOUBLE WORD ALIGNED
&I       SETA  0               STARTING VALUE
.TABLOOP ANOP                  LOOP UNTIL TABLE IS DONE
&K       SETA  1               NEED SIXTEEN ENTRIES PER LINE
&X       SETC  'AL1('
         AGO   .X16LP
.X16NXT  ANOP
&X       SETC  '&X'.'&J'.','
.X16LP   ANOP                  16 ENTRY LOOP
&J       SETA  0               STARTING RESULT
&L       SETA  1               STARTING ADDEND
&M       SETA  1               8 BITS PER BYTE
&N       SETA  128             STARTING COMPARAND X'80'
&O       SETA  &I              COPY CURRENT BYTE TO REVERSE
.BYTELP  ANOP
         AIF   (&O LT &N).BYTEFT  LESS THAN CURRENT - 0
&O       SETA  &O-&N
&J       SETA  &J+&L
.BYTEFT  ANOP
&L       SETA  &L*2            NEXT ADDEND
&N       SETA  &N/2            NEXT COMPARAND
&M       SETA  &M+1            NEED EIGHT BITS
         AIF   (&M LE 8).BYTELP
&I       SETA  &I+1
&K       SETA  &K+1
         AIF   (&K LE 16).X16NXT
&X       SETC  '&X'.'&J'.')'
         DC    &X
         AIF   (&I LT 256).TABLOOP
         MEND

-Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of John Gilmore Sent: Wednesday, July 24, 2013 10:29 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Is there a "reverse bits" hardware instruction? The construction of arbitrary translation tables can be error-prone, and when it is it is better done procedurally. I use the HLASM macro language, which is entirely adequate to such tasks; but to each his own. Here, however, we have for a TROO only the 256 permutations taken two at a time of the sixteen hexadecimal digits, viz., 0==>0, 1==>8, 2==>4, 3==>c, 4==>2, 5==>a, 6==>6, 7==>e, 8==>1, 9==>9, a==>5, b==>d, c==>3, d==>b, e==>7, f==>f and they can be enumerated readily, by program or manually (and certainly without resort to cut-and-paste from a dump). Symmetries can also be exploited, and the whole thing can be arithmetized, but to do either would put mathematics dropouts at a disadvantage. The problem CAN be addressed with left circular shifts/left rotations, but they must be nested (and iterated for long bit strings). The TROO turns out to be faster, particularly for those long bit strings. 
The problem of bit-string reversal has its own interest, but if its purpose is in effect to simulate a FROGR using a FLOGR, then other approaches are possible. Specifically, a TRTRE, Translate and Test Reverse Extended, can be used. It proceeds from right to left, high to low storage address, in a byte string only until it finds a non-zero value in its table that corresponds to the current byte's rank. Permutations taken two at a time of the hexadecimal digit-codes:

0000==>0000, 0
0001==>0001, 1
0010==>0010, 2
0011==>0001, 1
0100==>0011, 3
0101==>0001, 1
0110==>0010, 2
0111==>0001, 1
1000==>0100, 4
1001==>0001, 1
1010==>0010, 2
1011==>0001, 1
1100==>0011, 3
1101==>0001, 1
1110==>0010, 2
1111==>0001, 1

in which zero indicates the absence of a one bit and a non-zero value indicates both the presence of a one bit and its one-origin offset from the rightmost bit position. The permutations/code points x'N0', N = 1, 2, . . . , f need 'special' treatment: 8 must be added to the values shown above for them. Unsurprisingly, this turns out to be faster than reversal followed by FLOGR. John Gilmore, Ashland, MA 01721 - USA -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INF
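The values in that enumeration are the one-origin offset of the rightmost one bit of each 4-bit pattern; a short Python sketch (my function name, not anything from the post) regenerates the table:

```python
def rightmost_one(bits):
    """One-origin offset of the rightmost one bit; 0 means no one bit."""
    # bits & -bits isolates the lowest set bit; its bit length is the offset.
    return (bits & -bits).bit_length()

# Regenerate the sixteen 4-bit entries enumerated above.
expected = [0, 1, 2, 1, 3, 1, 2, 1, 4, 1, 2, 1, 3, 1, 2, 1]
assert [rightmost_one(n) for n in range(16)] == expected
```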
Re: Is there a "reverse bits" hardware instruction?
I can't imagine any instruction sequence in any language performing a "Load Reversed with Mirrored Bytes" more efficiently in the Z/Architecture than a STG, a TR for eight bytes and an LRVG. Even though the TR is probably micro-coded (I don't know about the LRVG), I can't see how any loop that shifts and manipulates the data and repeats up to 63 times (assuming a very dense register) could outperform this. I wrote an algorithm using a FLOGR, but except in the best cases (all 0s or many leading 0s), I can't imagine it running faster. And with negative numbers (-1 being the worst case), you would probably want to exclusive-or with foxes before and after the operation to make the value more sparse. However, in your initial post you talked about the above sequence involving the TR being complex. I assume you're talking about the translate table itself. When I need translate tables that are not "simple" and are particularly error prone, I write a program to create them. I would quadword-align the origin and result tables, do the tests and sets (in this case X'80' to X'01', ... X'01' to X'80'), load the address of the result table in a register, and DC H'0' to get an 0C1. I would set a SLIP and run the job. I could then format the dump and cut and paste (with a little manipulation) the table into an assembler source. In this case, if the first and last 16 bytes of the table are correct, then it's probably 100% correct. I find the half hour I spend doing this for "error prone" translate tables can save me hours of debugging later. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Charles Mills Sent: Wednesday, July 24, 2013 7:31 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Is there a "reverse bits" hardware instruction? Thanks all. You're right, "just how fast DOES this code need to be?" And the answer is I should know, but I don't. I don't want to waste the customer's cycles. 
I am smart enough to know that I am too dumb to know how fast it needs to be. The right answer lies in profiling, and some other task has always been just a little higher priority than profiling. Thanks! Great link! The De Bruijn thing is amazing. I was a math minor but I hated it. I am very weak on the higher math relevant to programming. Charles -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Andrew Rowley Sent: Wednesday, July 24, 2013 8:17 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Is there a "reverse bits" hardware instruction? How fast does this code need to be? David's ffs64 looked pretty good to my inexpert eye; I think you would have to be running it very frequently for something to be measurably faster. There are some similar discussions here, including some branchless techniques that probably would be faster (not necessarily detectably): http://stackoverflow.com/questions/757059/position-of-least-significant-bit-that-is-set One answer also talks about clearing the lowest set bit. -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
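For reference, the branchless De Bruijn technique mentioned above looks like this in Python; this is a sketch of the widely published 32-bit version (constants from the standard De Bruijn lookup), not David's ffs64:

```python
DEBRUIJN = 0x077CB531  # a 32-bit De Bruijn sequence
INDEX = [0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
         31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9]

def lsb_position(v):
    """Zero-origin position of the least significant set bit (v != 0)."""
    isolated = v & -v                       # clear all but the lowest one bit
    # Multiplying a power of two by the De Bruijn constant shifts a unique
    # 5-bit pattern into the top bits; the table decodes it to a position.
    return INDEX[((isolated * DEBRUIJN) & 0xFFFFFFFF) >> 27]

assert [lsb_position(1 << k) for k in range(32)] == list(range(32))
```

No branches, no loop: one AND, one multiply, one shift and one table lookup, which is why it shows up in those Stack Overflow answers.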
Re: Dynamic LPA Services
I relocate all non-PC code into 31 bit storage. To the code being called, it appears as if it's RMODE31. I do call PC routines above the bar, but it would be trivial to relocate them as well if it became necessary. I trust the Z/Architecture to handle the PC linkage. -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Walt Farrell Sent: Thursday, July 11, 2013 9:06 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Dynamic LPA Services On Thu, 11 Jul 2013 09:03:33 -0400, John Gilmore wrote: >As I read Kenneth Wilkerson's post he is arranging things so that an >RMODE(64) routine that needs system services unavailable to it as such >arranges to have a different, associated RMODE(31) routine request them >and make their results available to it. As I read it, that's not what Kenneth said, John. He said that he has an RMODE(31) stub that he uses as the address -of- the PC routine, and the stub then invokes the actual PC code that is RMODE(64). He specifically talked about using the system services from his RMODE(64) code, and if that's true then it's unsupported, as Peter mentioned. > >This scheme [or the alternative one in which an RMODE(31) routine hands >off functions to, or accesses data in, an RMODE(64) one] is entirely >viable and much used in IBM code. Are you perhaps confusing AMODE and RMODE, John? As far as I know, IBM does not make much use of RMODE(64) code. I believe the capability of RMODE(64) code was provided for DB2's use, and I suspect only DB2 is using it for much (though I have no real way of knowing, any more.) It is true that RMODE(31) routines are used often to handle AMODE(64) callers, of course. >In my own programming I >now often use AMODE(64) code in RMODE(31) routines to facilitate just >such exchanges. Right: AMODE(64) in RMODE(31) is just fine. But RMODE(64) code is rare, I believe, and has the restrictions that Peter mentioned. 
RMODE(64) support for code is documented to be for code that does not call system services. While system services may be documented to allow AMODE(64) callers, that does not mean that they will properly handle RMODE(64) callers. I presume IBM knows (or suspects) that some issues exist if RMODE(64) code were to call system services, or they would not have made that restriction. But I suppose it is also possible that they are simply being cautious and avoiding a heavy testing and warranty expense by stating that restriction. -- Walt -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: Dynamic LPA Services
I'm basing everything I'm doing on the Principles of Operations manual (POM). The Z/Architecture is the final authority for any program, even MVS. The linkage for PC instructions is handled by the linkage stack, which fully supports the 128-bit PSW and 64-bit registers. If it didn't, nothing that I'm doing would work. I've had STORAGE abends (B78, etc.) and they get communicated to my RTM exit, and the exit handles the 64-bit retry by retrying into a 31-bit stub that BSMs to the actual retry. It's important that I mention that even in PC calls, I ensure that all parameters are below the bar. And if a problem ever did occur, I would simply start relocating PC calls below the bar as well. This methodology works for BLDL, so I can't see why it wouldn't work for STORAGE. I've been running RMODE64 since shortly after 1.13 became available without any incident related to RMODE64 other than program bugs. All of this probably works because the RMODE64 programs define a 31-bit RTM exit and 31-bit retries. I understand your objection to LOAD with ADDR=. The fact that the code is not identified can be problematic. But the nucleus solves this problem by using load tables (CVT, SVT, SFT, etc.) and provides a service, NUCLKUP. I have simply adopted that methodology: a load table with a RESOLVE command that is also PC intelligent. It's not difficult to extract the LTOR and resolve the entry tables. The POM describes the architecture very thoroughly. I actually don't use ADDR64=. During server initialization, I have to load the code to verify whether or not the code has changed. Since all of the code is self-relocating (no ESDs or RLDs), if it has changed, I simply MVCL it into the common memory object. At the end of initialization, I also DAT-protect the code area so that essentially, except for the fact that it can't be identified, it's just like an LPA above the bar. If there were a way to identify RMODE64 code, I would use it. 
Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Peter Relson Sent: Thursday, July 11, 2013 6:28 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Dynamic LPA Services RMODE64: Perhaps you are not aware that z/OS provides support for RMODE 64 routines only when they call no system services. If that was your case, then great. But apparently it isn't, since you mentioned STORAGE and ESTAEX and CSVQUERY, which certainly do not document that they support RMODE 64 invocation. You are taking a risk. Is it worth it? Presumably you are using LOAD with ADDR64 rather than LOAD with ADDR. Perhaps I misread your original post, but I thought it said LOAD with ADDR. I still fully stand by the statement that LOAD with ADDR= to common storage should not be used for programs any longer. LOAD with ADDR64, it is true, has no dynamic LPA equivalent, so to the extent you have a routine that properly qualifies, there can be benefit. Peter Relson z/OS Core Technology Design -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: Dynamic LPA Services
>since most of the stuff I write is RMODE64. >Really? Perhaps you meant AMODE 64. But I'm not sure what that has to do with PC routines. And it has a lot to do with PC routines, since the LPA is 24/31-bit storage. If you want to exploit RMODE64, you can't currently do that in the LPA. PC routines can be called from any AMODE, any RMODE and any environment (the new transaction environment excluded) except SACF 256 and 768. That is why PC routines are important. More than two-thirds of the server (approximately 500K of code) executes in a common memory object. The code that does not is mostly involved in SVC screening, runs as IRBs, runs as tasks in the server, or consists of specialized services that make a lot of non-PC MVS service calls. Since most of the server, including the API, is designed to run in cross-memory mode, there are no SVC calls. Since PC calls such as STORAGE, ESTAEX and CSVQUERY (which are all the services normally used by the server) are RMODE64 capable, this presents no problem. Some of the API services branch-call MVS services and even do I/O. Since all the code is ARCHLVL=2 and is self-relocating, I copy the guts (macro expansions) of these calls into a 31-bit work area enclosed in a BSM back to the RMODE64 code. My RTM exit recognizes abends in these relocated copies and reports them accordingly. The two big issues were RTM exits and getting PCAUTH to accept my 64-bit addresses. I got around these issues through a concept I call surrogation. I create a 31-bit stub program that contains the entry points to all the PC calls and their RTM exit (they all share the same RTM ESTAEX or FRR exits). This code handles the redirection into the memory object. I could not find methods provided by MVS to do these things (I did not spend a whole lot of time searching). So I designed and wrote my own methods. 
I designed the server to run in RMODE64 from day 1 so when 1.13 was released, I was able (through a few macros and the surrogate program) to get many of the server programs above the bar in a single day. With time, I've moved most of the server including much of the UI above the bar. I even execute ISPF calls above the bar by replacing the CALL macro with a self-written macro. Again, my point is that I don't believe in designing servers to the lowest common denominator provided you are willing to write the code to fill in the gaps. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Peter Relson Sent: Tuesday, July 09, 2013 6:42 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Dynamic LPA Services >You know who owns it because its defined as a PC and therefore has an >entry table assigned to it. I suspect that every diagnostician in the world disagrees with you about using LOAD with ADDR=. It is true of course that you could navigate from the PC number to the entry table to the target address for the PC. But then you want to know what module is at that target address. Having a "name" that has been provided by the module owner (presumably one that follows the module owner's naming conventions) makes that easiest. The same is true if you blow up at address "x" and want to find out in what module "x" is. Using dynamic LPA for things in common makes that easier. And has no significant downside. >since most of the stuff I write is RMODE64. Really? Perhaps you meant AMODE 64. But I'm not sure what that has to do with PC routines. >MVS is going to treat it as authorized simply because it's in the LPA. That means that it is accepted as the target of a LINK, LOAD, (etc) from an authorized requestor. It does not mean that it will get control in an authorized state from EXEC PGM=. That requires AC=1. >To say that you can't ever free a PC routine is untrue. 
Almost any >space switching PC will terminate as soon as the server that defined it >terminates. I carelessly omitted, but the thread had already established, that we were talking about non-space-switch PC's. >Certainly, any PC routine that is defined as non-space switching >system PC routine that can be called without any provided interface >probably cannot be freed. The only such "interface" that I can think of is one that increments a counter (or sets a flag if that suffices) before issuing the PC and freeing of the area is not allowed if the counter is non-0. Such counters/flags are notoriously problematic due to memterm considerations. >However, a new copy can be loaded and >redefined which is why I like reusable LXs. Everyone should like reusable LX's. But you still do have to get rid of the old one first so there's a window when neither is available. >In my book, PC routines are the only way to fly. I don't think anyone is disagreeing with you. I was only pointing out that LOAD with ADDR= is not the way to go. Peter Relson z/OS Core Technology Design -- For IBM-MAIN subscribe / signoff / archive acc
Re: Dynamic LPA Services
A D6-22 is a linkage exception, meaning the LX is not connected to the address space issuing the PC. For a system LX, this means the LX has not been connected by an ETCON, the LX has been disconnected by an ETDIS or ETDES, or the address space that connected the LX has terminated. For a non-system LX, it could mean the address space issuing the PC has not issued the ETCON to connect the non-system LX. If you do a SLIP COMP=0D6, you can use the IPCS Status display to list all the linkage tables in ascending LX order. Then you can verify visually whether the LX is connected or not. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Binyamin Dissen Sent: Tuesday, July 09, 2013 3:29 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Dynamic LPA Services You should be aware that ETDEF does not set a return code. It does inline instructions to build a single entry. The ETCRE/ETCON are the ones that make something happen. On Mon, 8 Jul 2013 22:28:59 GMT "esst...@juno.com" wrote: 
:>As the original Poster, I thank everyone for their input. The various information provided has been excellent. Thank You.
:>I am still getting the 0D6-022 Abend, and I am not understanding why.
:>So let me level set everyone. I am on a z/OS 1.4 system.
:>I do not have LX Reuse on this system; I don't think that has anything to do with this issue.
:>I use the CVT to determine LX Reuse:
:>         USING CVTMAP,R15          .INFORM ASSEMBLER
:>         L     R15,X'10'           .ADDRESS OF CVT
:>         TM    CVTFLAG2,CVTALR     .LX Reuse Available    01404522
:>         BNO   NO_LX_REUSE         .NO EXISTENCE          01404622
:>I would like to understand the use of CR0 to determine this, if someone would post the code.
:>I am aware of obtaining storage in common and loading a routine into key 0 SP 241 or similar; I'm trying to gain a new skill by using LPA Dynamic Services.
:>In a separate job I dynamically add a module to LPA using CSVDYLPA. 
:>Then I start an address space and use CSVQUERY to obtain the entry point address of the module I added to LPA.
:>The entry point address returned from CSVQUERY is then used in an ETDEF SET macro that describes a non-space-switching PC routine:
:>SET_ETD1 DS    0H                                            03340004
:>         ETDEF TYPE=SET,ETEADR=ETD1,ROUTINE=(2),RAMODE=31,  X03350004
:>               STATE=SUPERVISOR,PC=STACKING,SSWITCH=NO,     X03360004
:>               SASN=OLD,ASCMODE=PRIMARY,                    X03370004
:>               EK=8,PKM=OR,                                 X03380004
:>               AKM=(8,9),EKM=(8)                             03390004
:>*                                                            0344
:>         ST    R15,XMSRESP        Save Response Code         03410004
:>         BRAS  R14,CHKRESP        Go Check Response Code In Reg-15
:>The address space has not terminated.
:>Now I want to submit a job which invokes this routine via a PC instruction. The PC number is D601, where D6 is the LX and 01 is the second entry in the entry table. However, when I issue the PC instruction I get an 0D6 Abend...
:>Thank You Again for all Your comments.
:>-- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
-- Binyamin Dissen http://www.dissensoftware.com Director, Dissen Software, Bar & Grill - Israel Should you use the mailblocks package and expect a response from me, you should preauthorize the dissensoftware.com domain. I very rarely bother responding to challenge/response systems, especially those from irresponsible companies. -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: Dynamic LPA Services
>The point is that SLIP LPAMOD= and the IPCS WHERE subcommand >will not be able to identify your module by name. So when someone needs to refer to your module on a SLIP command, they will need to manually determine the address of your module in order to use ADDRESS= (and the >address could change every time your product starts, and is likely to be different on each member of a sysplex). >As a z/OS diagnosis expert, I view that as a serviceability issue. Since a server address space is required to define the PCs, the server provides an operator command such as RESOLVE. The customer issues RESOLVE,pgmname+disp and the system returns the address and the instruction at that address for verification. The address can then be cut and pasted into a SLIP command. I have my own WHERE facility that is PC intelligent. My point is that you don't have to design servers to the lowest common denominator as long as you are willing to provide services to fill in the gap. The diagnostic capabilities that the server I write provides are much greater than what currently exists. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jim Mulder Sent: Tuesday, July 09, 2013 12:16 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Dynamic LPA Services > You know who owns it because it's defined as a PC and therefore has an entry > table assigned to it. Looking in the entry tables for a program is > just as > common a practice as looking for "identified" programs. So finding PC > routines just requires different methods. Besides, if this is a > stacking PC > which is the only type I use, the linkage stack has everything needed > to associate the call with the PC routine including the PC number. The point is that SLIP LPAMOD= and the IPCS WHERE subcommand will not be able to identify your module by name. 
So when someone needs to refer to your module on a SLIP command, they will need to manually determine the address of your module in order to use ADDRESS= (and the address could change every time your product starts, and is likely to be different on each member of a sysplex). As a z/OS diagnosis expert, I view that as a serviceability issue. > To say that you can't ever free a PC routine is untrue. Almost any > space switching PC will terminate as soon as the server that defined > it terminates. So these can be released and refreshed as needed every > time the > server recycles. Certainly, there are many PC routines that can't be freed. > But if a PC routine is designed to be called as part of an API or UI, then > API/UI recovery can easily recover the error and report it as the > server terminating. Certainly, any PC routine that is defined as > non-space switching system PC routine that can be called without any > provided interface probably cannot be freed. However, a new copy can > be loaded and > redefined which is why I like reusable LXs. Consider the case where the storage formerly occupied by the freed PC routine has been reassigned by VSM, and now contains data that happens to look like a valid instruction stream. So now your "PC routine" is executing unintended code, with the authority of the user and your PC routine. What will cause your API/UI recovery to get control, and if it does get control, how will it "easily recover the error"? How will it detect and repair any damage which occurred due to the execution of the unintended instructions? Jim Mulder z/OS System Test IBM Corp. Poughkeepsie, NY -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: Dynamic LPA Services
Control Register 0 bit 44 contains the system setting for LXRES, as defined in the POM in the chapter on control. I'm a Z/Architecture guy and I usually go to the architecture for settings instead of z/OS. I'm also pretty sure LX Reuse did not exist in 1.4, though I may be wrong. It was added sometime in the 1.4 to 1.6 time frame. In your description, I don't see any reference to LXRES, ETCRE or ETCON. ETDEF only creates the entry table needed to define PCs. LXRES reserves an LX (you probably need a system one) and returns a token. I assume that the D6 LX was acquired via an LXRES and it's a system LX. ETCRE creates a working copy of your entry tables in the PCAUTH address space and also returns a token. ETCON connects your entries, via the LXRES and ETCRE tokens, to the address spaces that are allowed access to your PC routines. For system LXs, that's every address space. The PC numbers start at LX00 and go to LX##, where 00 is assigned to the first entry in your entry table and ## is the last entry, up to 255. If you define space-switch PCs, more setup is required. The Extended Addressability manual goes into all this. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of esst...@juno.com Sent: Monday, July 08, 2013 5:29 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Dynamic LPA Services As the original Poster, I thank everyone for their input. The various information provided has been excellent. Thank You. I am still getting the 0D6-022 Abend, and I am not understanding why. So let me level set everyone. I am on a z/OS 1.4 system. I do not have LX Reuse on this system; I don't think that has anything to do with this issue. I use the CVT to determine LX Reuse:
         USING CVTMAP,R15          .INFORM ASSEMBLER
         L     R15,X'10'           .ADDRESS OF CVT
         TM    CVTFLAG2,CVTALR     .LX Reuse Available    01404522
         BNO   NO_LX_REUSE         .NO EXISTENCE          01404622
I would like to understand the use of CR0 to determine this, if someone would post the code. 
I am aware of obtaining storage in common and loading a routine into key 0 SP 241 or similar; I'm trying to gain a new skill by using LPA Dynamic Services. In a separate job I dynamically add a module to LPA using CSVDYLPA. Then I start an address space and use CSVQUERY to obtain the entry point address of the module I added to LPA. The entry point address returned from CSVQUERY is then used in an ETDEF SET macro that describes a non-space-switching PC routine:
SET_ETD1 DS    0H                                            03340004
         ETDEF TYPE=SET,ETEADR=ETD1,ROUTINE=(2),RAMODE=31,  X03350004
               STATE=SUPERVISOR,PC=STACKING,SSWITCH=NO,     X03360004
               SASN=OLD,ASCMODE=PRIMARY,                    X03370004
               EK=8,PKM=OR,                                 X03380004
               AKM=(8,9),EKM=(8)                             03390004
*                                                            0344
         ST    R15,XMSRESP        Save Response Code         03410004
         BRAS  R14,CHKRESP        Go Check Response Code In Reg-15
The address space has not terminated. Now I want to submit a job which invokes this routine via a PC instruction. The PC number is D601, where D6 is the LX and 01 is the second entry in the entry table. However, when I issue the PC instruction I get an 0D6 Abend... Thank You Again for all Your comments. -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
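Regarding the CR0 question above: z/Architecture numbers bits from 0 at the most significant end, so testing "CR0 bit 44" of a 64-bit control register means testing bit 63-44 in the more familiar LSB-0 terms (in assembler you would first store CR0 into a doubleword with the privileged STCTG instruction). A small Python sketch of just the bit arithmetic, with my own function name:

```python
def cr_bit(cr_value, bit, width=64):
    """Extract a control-register bit given its MSB-0 bit number."""
    return (cr_value >> (width - 1 - bit)) & 1

# A CR0 image with only bit 44 on:
cr0 = 1 << (63 - 44)
assert cr_bit(cr0, 44) == 1 and cr_bit(cr0, 43) == 0
```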
Re: Dynamic LPA Services
Thank you Walt. I was remembering an issue incorrectly. I certainly am guilty of confusing how content supervision handles some aspects of authorization, which is one reason I stick to PC routines. -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Walt Farrell Sent: Monday, July 08, 2013 2:17 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Dynamic LPA Services On Mon, 8 Jul 2013 07:55:46 -0500, Kenneth Wilkerson wrote: >And it doesn't >matter what the AC= is for an LPA program. MVS is going to treat it as >authorized simply because it's in the LPA. I discovered that the hard way. > That's not true, Kenneth. MVS will certainly consider any LPA-resident module to have been loaded from (i.e., resident in) an APF-authorized library, but that is not related to the AC=0/1 setting. Being resident in an APF-authorized library simply means that the system will allow another program that is already running authorized (APF, system key, or supervisor state) to load the module, and this is true for both AC=0 and AC=1 modules. If the module is not in an APF-authorized library and an authorized program tries to load it in the normal way, the load will fail. If the module does have AC=1, and it's resident in an APF-authorized library, then if the module is invoked as the jobstep program by the initiator (or in a small handful of other ways) the new jobstep will gain APF authorization and run APF-authorized. If you have an LPA-resident module that is AC=0 and you run it via EXEC PGM=, it will NOT run APF-authorized. It needs AC=1 for that. Many people (including some IBMers, and some writers of documentation) seem confused by the distinctions between APF-authorized libraries, AC=1, and running APF-authorized. 
-- Walt -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: Dynamic LPA Services
You know who owns it because it's defined as a PC and therefore has an entry table assigned to it. Looking in the entry tables for a program is just as common a practice as looking for "identified" programs. So finding PC routines just requires different methods. Besides, if this is a stacking PC, which is the only type I use, the linkage stack has everything needed to associate the call with the PC routine, including the PC number. I rarely use anything but PC routines anymore, since most of the stuff I write is RMODE64. The IPCS STATUS display shows the entry tables for that reason. And it doesn't matter what the AC= is for an LPA program. MVS is going to treat it as authorized simply because it's in the LPA. I discovered that the hard way. To say that you can't ever free a PC routine is untrue. Almost any space-switching PC will terminate as soon as the server that defined it terminates. So these can be released and refreshed as needed every time the server recycles. Certainly, there are many PC routines that can't be freed. But if a PC routine is designed to be called as part of an API or UI, then API/UI recovery can easily recover the error and report it as the server terminating. Certainly, any PC routine that is defined as a non-space-switching system PC routine that can be called without any provided interface probably cannot be freed. However, a new copy can be loaded and redefined, which is why I like reusable LXs. In my book, PC routines are the only way to fly. They can be called in any AMODE, any RMODE and any environment other than SACF 256 and 768, which are very rarely used. And there are as many ways to define and use them as there are people that define and use them. I was just making a suggestion. Peter is making another. I imagine Peter has trusted methods that work. I know that I do as well. If you're going to define a PC, I suggest you don't allow the old dogma to get in your way. Find the method that works for you. 
PC routines are where MVS has been headed for a long time. Kenneth -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Peter Relson Sent: Monday, July 08, 2013 6:45 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Dynamic LPA Services 0D6-22: A linkage index (LX) translation exception occurred; the program interruption code is X'22'. This cannot have anything to do with the location of the target routine. Things added to dynamic LPA are part of LPA. They are built out of (E)CSA. What they are not part of are PLPA, MLPA, FLPA, which are not built out of (E)CSA. The approach of using "directed load" is frowned upon. It does not buy anything and has detrimental RAS effects, since the storage area being used as the PC target is now not known by name and thus is harder for any diagnostician to determine who owns it. There is just about no reason any more to do LOAD with ADDR to CSA for code. P.S., do not use LOAD with GLOBAL=YES if your address space could ever terminate without wait-stating the system, as the system frees that storage upon such termination. It is true that someone "could" LINK to the name since there is a name, but that is never of concern to a properly written program. The LPA routine should not be marked as AC=1. By the way, there are extremely few cases where a PC routine can *ever* be freed without introducing a system integrity problem. Do you truly know (as opposed to just hope) that no one is within the routine at the time you want to free it? Peter Relson z/OS Core Technology Design -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: Dynamic LPA Services
I don't know what you're trying to do, but I would never define a PC in the LPA, for a lot of reasons. The most basic of these is that LPA routines are callable by the EP= or EPLOC= parameter on the LOAD, LINK, XCTL and ATTACH services. When called through these services, the traditional linkage is significantly different from PC linkage. Of course, you might want to be callable by LOAD, LINK, XCTL and ATTACH, which means you would have a separate entry point defined for the PC routine. I use a jump table at the start of a program to define multiple entry points.

To define a system-level PC routine, I normally acquire a CSA control block that can be found by a system-level name/token. Another approach is to use a word in the ECVT that has been assigned to you. I don't remember the procedure, but a vendor can get one word in the ECVT assigned by IBM to that vendor. You may use another method. But regardless of the method used to anchor the control block, the small CSA control block would contain the EPA/length and PC assignments. I always define reusable LXs, so I keep the LX number in there as well. I think reusable LXs are simpler, but you have to check control register 0 to make sure the feature is available.

I then acquire key 0 SP=241 (CSA, non-fetch-protected) storage and do a LOAD ADDR= into that CSA storage. I can now define the PC. Each time you need to refresh the PC routine, you'll need to release the old storage, load a fresh copy and redefine the PC. If you use a reusable PC number, the PC number (low 32 bits) will remain the same (unless you change it), but the sequence number (high 32 bits, architecturally passed in R15) will be incremented by one. For that reason, I always use R15 as the PC register, and I save the sequence number and PC number in a doubleword so I can load it into R15 and do a PC 0(R15).

Since you're getting a 0D6-22 and you're sure the PC is defined, I suspect that the defining address space has terminated.
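Condensed into HLASM, that recipe might look roughly like the sketch below. This is a hypothetical outline, not working code: the labels are invented, several required operands are omitted, and the exact LXRES/ETDEF/ETCRE/ETCON keywords should be verified against the Authorized Assembler Services references before use.

```hlasm
* Hypothetical sketch of the reusable-LX recipe - illustrative only.
*
* Static entry table description (assembled data):
ETDESC   ETDEF TYPE=INITIAL
ENT0     ETDEF TYPE=ENTRY,ROUTINE=0,AKM=(0:15),EKM=0
         ETDEF TYPE=FINAL
*
* Run-time definition:
         LXRES ELXLIST=ELXL,SYSTEM=YES,REUSABLE=YES  reusable LX
         STORAGE OBTAIN,LENGTH=MODLEN,SP=241,KEY=0   key 0 CSA
         LR    R2,R1                      R2 -> CSA storage
         LOAD  EP=MYPCMOD,ADDR=(R2)       fresh copy into CSA
         ETDEF TYPE=SET,ETEADR=ENT0,ROUTINE=(R2)  plug in the EPA
         ETCRE ENTRIES=ETDESC             create the entry table
         ST    R0,ETTOKEN                 ETCRE returns a token
         ETCON TKLIST=ETTOKEN,ELXLIST=ELXL  connect LX to table
*
* Calling it: keep the full 8-byte PC value (sequence number in
* the high half) in a doubleword and load it into R15.
         LG    R15,PCVALUE
         PC    0(R15)
```

To refresh the routine, the same sketch would be repeated with a new CSA copy after disconnecting and freeing the old one, which is the point of using a reusable LX.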
MVS has to have an address space to own a resource. When you acquire an LX and define a range of PC routines, tables are created in real storage and are assigned to the defining address space. The PCAUTH server defines your PC tables in the private SQA of the PCAUTH address space. They are disconnected and released from real storage whenever the defining address space terminates. If you want PC routines to persist for the duration of an IPL, you need to schedule an SRB into a system address space to define the required PCs. The choice of system address space is yours. A non-space-switch PC won't execute in the system address space; it will execute under the DU control blocks (SRB or TCB) in the address space of the caller.

Kenneth

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Chuck Arney
Sent: Sunday, July 07, 2013 3:31 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Dynamic LPA Services

Did your dynamic LPA replace a module that was already established as the PC routine? The book says you can't do that, as the PC linkage tables are not updated by the dynamic LPA service. If you are replacing a module defined as a PC, you would have to remove the PC and redefine it with the new module address.

Chuck Arney
Arney Computer Systems

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of esst...@juno.com
Sent: Sunday, July 07, 2013 3:08 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Dynamic LPA Services

Chuck Arney: "It should work just fine Paul." Well, I tend to agree, but I seem to get the old 0D6-22 abend when I try to PC to the routine. I first thought the PC number was incorrect; however, I listed my PC numbers and the respective PC number is correct. That's why I posted this question. Thanks for the response, I will recheck the code.
Paul D'Angelo

-----Original Message-----
From: Chuck Arney
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Dynamic LPA Services
Date: Sun, 7 Jul 2013 14:12:34 -0500

It should work just fine Paul.

Chuck Arney
Arney Computer Systems

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of esst...@juno.com
Sent: Sunday, July 07, 2013 12:31 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Dynamic LPA Services

I have been working with the dynamic LPA services of z/OS (CSVDYLPA). I'm able to add, delete, and invoke modules that were dynamically added to "LPA" using CSVDYLPA and CSVQUERY. However, after re-reading the description of CSVDYLPA, it's not really LPA, it's more common storage. So my question is: should I be able to invoke a dynamically added module as a non-space-switching PC routine?
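As a footnote to the ownership point above: one hypothetical way to make the PC definitions survive for the life of the IPL is to run the defining code under an SRB in a long-lived system address space. A fragment follows - the labels are invented and the operand list is from memory and incomplete, so confirm everything against the IEAMSCHD description:

```hlasm
* Hypothetical: schedule an SRB into a system address space so
* the PC tables are owned by a space that lasts the IPL.
* DEFPCEP (SRB entry point) and SYSSTOKN (target STOKEN) are
* illustrative labels; other required operands are omitted.
         IEAMSCHD EPADDR=DEFPCEP,TARGETSTOKEN=SYSSTOKN,        X
               PRIORITY=LOCAL
```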
Re: Assembler
TM I2REC+ISTAT-IREC,SDLET

is equivalent to:

         LA    somereg,I2REC      somereg is any of R1-R15
         USING IREC,somereg
         TM    ISTAT,SDLET
         DROP  somereg

ISTAT-IREC is the assembly-time offset of the ISTAT field within the IREC mapping; adding that offset to I2REC tests the same field in the record at I2REC, without needing a USING.

Kenneth

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Charles Mills
Sent: Monday, June 24, 2013 12:31 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Assembler

If you (1) post the 2 or 3 instructions following the TM and (2) post the "object code" that appears in the listing to the left of the instruction, then we can help you more.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Ron Thomas
Sent: Monday, June 24, 2013 8:24 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Assembler

Hello. Can someone please let me know what this assembler code does?

TM I2REC+ISTAT-IREC,SDLET

How does the above code work?
Re: New Software Tool for z/OS Developers Announced by Arney Computer Systems
TDF does not use traditional "intercept" technology. TDF never alters any user code other than user-specified breakpoints, and it never alters any MVS code in any way.

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Chuck@arneycomputer
Sent: Wednesday, April 10, 2013 7:54 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: New Software Tool for z/OS Developers Announced by Arney Computer Systems

I think everyone is aware of that, but of course if not, they should understand it. That said, everyone should also know that there is no method of achieving the end result that is supported by IBM. Therefore, there is no choice if you need the function. This processing is done using the standard SVC screening facility that is provided by IBM, but they do not sanction some of the things that can be done with it. Keep in mind that this is a system-level debugging product that should only be used in a development environment. We take extensive measures to ensure system integrity, but it is a very powerful tool that can be misused. It should never be used in a production environment. It serves no function for production work.

Chuck Arney

On Apr 10, 2013, at 8:05 AM, Peter Relson wrote:

>> wrap all content supervision (LOAD, LINK, XCTL and ATTACH), RTM exit
>> (ESTAE(X), STAE, (E)SPIE and SETFRR) and selected schedule (such as
>> IEAMSCHD) service calls.
>
> As all should understand, very little of this would be considered
> supported in any way, shape or form, and if anything in this realm
> caused a problem (or could conceivably have caused a problem), IBM
> service might take a hard line about helping.
> Peter Relson
> z/OS Core Technology Design
Re: Use of the TRAPx Instructions
Setting up the trap environment requires authorization and can easily be done by a non-space-switch PC set up by an authorized server. The execution of a TRAPx instruction is not authorized and executes under the state of the program being trapped. So, NO, there are no system integrity issues when debugging unauthorized programs.

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Tom Marchant
Sent: Wednesday, April 10, 2013 5:33 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Use of the TRAPx Instructions

On Tue, 9 Apr 2013 16:17:25 -0500, Kenneth Wilkerson wrote:

>You have to
>be able to acquire key 0 to even examine the DUCT let alone modify the
>DUCT to define the required trap control blocks. This means, of course,
>that the application creating the trap environment must be authorized.

Doesn't that mean that it is difficult at best to ensure system integrity when debugging non-authorized code?

--
Tom Marchant
Re: Use of the TRAPx Instructions
Thanks. I forgot about that one. It is covered in Chapter 22 on exits in the Authorized Assembler Services Guide. This would also be classified as a branch-entry intercept.

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Gerhard Postpischil
Sent: Tuesday, April 09, 2013 9:15 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Use of the TRAPx Instructions

On 4/9/2013 7:03 PM, Kenneth Wilkerson wrote:
> So now specifically to asynchronous exits. There are 3 ways to
> schedule asynchronous exits that I know of, by STIMER(M), by SCHEDIRB
> and by the old SCHEDXIT. If there are other ways, please let me know.

There is at least one other - see the CIRB macro. Before HASP, operators on an OS/360 system had to issue an explicit START RDR command to read a job stream. I had a parameterized facility (define command, alter, delete by unit) that caused an unsolicited interrupt to a defined device to trigger the appropriate command, thus obviating the need for the operator to start the reader. For MVS I have a version with more flexibility - I can set an ATI in designated UCBs, and issue any command in response to an interrupt. This is handy for activating CRT terminals on a different floor (not all are defined to VTAM). The MVS version does not use CIRB, nor does it use the Master Scheduler; instead it calls CVT0EF00 directly to schedule an IRB and IQE.

Gerhard Postpischil
Bradford, Vermont
Re: Use of the TRAPx Instructions
Peter, since you did such an excellent job of describing the problem, I've decided to describe to you how TDF is going to handle asynchronous exits. Before doing so, I need to clearly state that for the current release of the product, asynchronous exit debugging is a restriction. Months ago I looked into the problem and devised a solution, but I didn't implement it because I had many other things on my plate. Second, all of the technology I'm about to describe exists in TDF right now. It's just a new implementation of the same intercept technology already being used.

The TRAPx instruction alone is insufficient to accommodate full-fledged debugging. As you mentioned, it's anchored off the Dispatchable Unit Control Table (DUCT), so its scope is a dispatchable unit, a TCB or SRB. In a complex system involving multiple tasks and multiple address spaces, this scope is easily exceeded. So a mechanism must exist to automatically extend the scope of the trap environment as the application grows in scope. The mechanism used by TDF is called Dynamic Program Intercepts. Without going into great detail, suffice it to say that TDF intercepts content supervision (LOAD, XCTL, LINK and ATTACH), RTM exit and key schedule service calls. So when a new task is attached, through commands, you can define whether TDF automatically sets up the trap environment for the new task. You actually attach a TDF program that sets up the environment and transfers control to your program with a newly set up trap environment. When you create an RTM exit, a small program in TDF actually gets control and it wraps your RTM exits. TDF can also wrap key schedule services like IEAMSCHD so that when an SRB is scheduled, it initializes the DUCT just as if the SRB had a hook macro assembled into it. The bottom line is that the MVS implementation (or lack of it) of the TRAP instruction can be compensated for through intercept technology.

So now specifically to asynchronous exits.
There are 3 ways to schedule asynchronous exits that I know of: by STIMER(M), by SCHEDIRB and by the old SCHEDXIT. If there are other ways, please let me know. Since I haven't really started coding this, I haven't finished all my research. STIMER(M) is by SVC. The existing TDF SVC intercept can be easily extended to include STIMER(M). SCHEDIRB and SCHEDXIT are branch-entry calls. TDF provides a mechanism called branch-entry intercepts, which are very special TRAP breakpoints that redirect execution of branch-entry calls (like SETFRR, for example) to an intercept, similar to the way SVC and PC interception works. Regardless of the type of schedule service, TDF replaces your exit address with a program that wraps your exit. The front end would "stack" the prior trap environment and the back end would "unstack" it. This is a simple explanation to a complex problem.

Kenneth

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Binyamin Dissen
Sent: Tuesday, April 09, 2013 4:42 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Use of the TRAPx Instructions

Looking at the DUCT, it also contains an indication of base address space or subspace. I would think that an ASYNC exit/IRB would not use the basespace indicator of the current RB.

On Tue, 9 Apr 2013 16:17:25 -0500 Kenneth Wilkerson wrote:

:>You are certainly correct about the z/OS implementation of the TRAPx
:>instruction. I often wondered why the hardware designers decided to
:>implement it such that it inherits the user state and not a predefined state
:>and why they didn't provide a service to "register" trap interfaces so they
:>could be shared or the asynchronous exit issue could be solved. You have to
:>be able to acquire key 0 to even examine the DUCT let alone modify the DUCT
:>to define the required trap control blocks. This means, of course, that the
:>application creating the trap environment must be authorized.
:>
:> But despite these z/OS limitations, in my mind, the asynchronous exit
:>issue is just a restriction. It's no bigger a restriction than the fact that
:>the TRAP can't be executed in HOME ASC (SAC 768) or Secondary ASC mode (SAC
:>256) or in transaction processing. Your scenario involves the asynchronous
:>exit also invoking a TRAPx instruction. So simply stated, asynchronous exits
:>cannot be debugged using the TRAPx instruction without a service to
:>determine if the TRAP control blocks are currently in use and perform a TRAP
:>stack function. But even this problem can be solved with a little extra code
:>(a PC routine) that stacks the current trap save area if it's in use.
:>
:>Certainly, the TRAPx instruction has fewer limitations than other available
:>methods. Every method is going to have restrictions. In my asynchronous
:>exits, I typically simply update a control block and post a task to proce
Re: Use of the TRAPx Instructions
You are certainly correct about the z/OS implementation of the TRAPx instruction. I often wondered why the hardware designers decided to implement it such that it inherits the user state and not a predefined state, and why they didn't provide a service to "register" trap interfaces so they could be shared or the asynchronous exit issue could be solved. You have to be able to acquire key 0 to even examine the DUCT, let alone modify the DUCT to define the required trap control blocks. This means, of course, that the application creating the trap environment must be authorized.

But despite these z/OS limitations, in my mind, the asynchronous exit issue is just a restriction. It's no bigger a restriction than the fact that the TRAP can't be executed in HOME ASC (SAC 768) or Secondary ASC mode (SAC 256) or in transaction processing. Your scenario involves the asynchronous exit also invoking a TRAPx instruction. So simply stated, asynchronous exits cannot be debugged using the TRAPx instruction without a service to determine if the TRAP control blocks are currently in use and perform a TRAP stack function. But even this problem can be solved with a little extra code (a PC routine) that stacks the current trap save area if it's in use.

Certainly, the TRAPx instruction has fewer limitations than other available methods. Every method is going to have restrictions. In my asynchronous exits, I typically simply update a control block and post a task to process whatever it is that I'm trying to process.

Kenneth, TDF Architect

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Morrison, Peter
Sent: Tuesday, April 09, 2013 3:37 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Use of the TRAPx Instructions

I have had extensive experience with the use of the TRAPx (TRAP2/TRAP4) instructions in a z/OS environment. z/OS offers no support for setting up to enable them.
Basically, you need to anchor things in the DUCT (Dispatchable Unit Control Table). There is one DUCT set up for each TCB and SRB. (Note that ONLY ONE DUCT is set up for a TCB - not one DUCT per RB level. THIS IS VERY IMPORTANT!) (Preserving a specifically formatted DUCT is important, but is not relevant to the discussion below. Just be aware that there are issues associated with it.)

Generally, you can regard TRAPx as a 'fancy branch'. The target is the routine whose address is set up in the control blocks. The hardware saves all state in the 'trap save area' first. BUT, there is a very significant problem when using TRAPx with z/OS! When your trap routine gets control, system state is as before the TRAPx instruction was executed. This includes the fact that interrupts are probably enabled. Why does this matter? Because, in z/OS, in TCB mode, if an interrupt occurs, processing of that interrupt could involve de-scheduling the TCB, deciding to request a new RB level, and later redispatching the TCB, so that the new RB level will get control. This can lead to the following scenario:

1: TCB-mode code executes a TRAPx instruction
2: The hardware saves all state (PSW/regs) in the trap save area
3: The registered trap handler routine is given control and starts executing...
4: An interrupt occurs
5: During processing of the interrupt, the current TCB has a new RB level stacked over it
6: The TCB is resumed. Execution now is for the new RB level
7: The new code executes a TRAPx instruction
8: The hardware saves all state in the trap save area. BAZINGA! The old information is overwritten!
9: The registered trap handler routine is given control and starts executing...

Because the trap save area has been overwritten, the lower-level handler, when it resumes execution, is not using correct information. There is not even any way to know that this has occurred.
While the situation CAN be circumvented by preventing asynchronous RB stacking (there is a bit in the TCB for this), this can play havoc with debugging as, for example, asynchronous exits to do with I/O won't execute...

For the above reason, use of TRAPx instructions as a way to implement breakpoints in code that executes on z/OS in TCB mode is not a good idea...

Peter Morrison
Principal Software Architect
CA
6 Eden Park Drive, North Ryde NSW 2113
Tel: 02 8898 2624 Fax: 02 8898 2600
peter.morri...@ca.com
Re: New Software Tool for z/OS Developers Announced by Arney Computer Systems
Howdy. My name is Kenneth and I'm the architect of TDF. I thought I would take a few minutes to clarify a little about TDF. This is my first time doing this.

First, TDF is designed to be much more than an interactive debug tool. It wasn't designed to compete with any existing products. Its primary purpose is to expand the realm of debugging tools outside of the development scope into testing, maintenance and customer problem determination. The first release is primarily about the interactive component because it's the part that is currently tested to our standards and we feel can be used reliably. Consider the example of locked code. TDF is carefully architected so you could interactively debug any locked code except disabled code and code holding a CPU lock. One day TDF might actually support this mode, but it would require more code than is currently justified. But TDF is designed to provide non-interactive data collection. Through panels to define 'traces', you can specify what states and data you want to collect, up to 2K per trace point. Each trace point can be tailored to the specific needs of that trace point. So TDF can be used to provide a dynamic trace of any code without any modification.

Now consider this. TDF has a scripting capability where all the commands needed to perform a trace or debug are recorded. A customer reports a problem. You design a set of traces to collect the needed data. You send the script to the customer. Since TDF does not require any code modifications, they start up a batch runtime component that executes the script against a test case. It collects the trace data, which can be shipped back to the product developer for analysis. A fix is prepared and shipped to the customer. The same script can now be run again to verify the fix. That is what TDF is designed to do. It's an entirely different debug paradigm that expands debugging tools into the realm of maintenance and problem determination.

Second, TDF has no boundaries.
TDF is dynamic and can operate across any number of address spaces, tasks, SRBs, PC routines and RTM exits. It does this by using the TRAP instruction. This instruction can execute in almost all environments. Without going into details, essentially it can execute wherever a PC instruction can execute. Essentially, TDF uses what we call Dynamic Program Intercepts to wrap all content supervision (LOAD, LINK, XCTL and ATTACH), RTM exit (ESTAE(X), STAE, (E)SPIE and SETFRR) and selected schedule (such as IEAMSCHD) service calls. This list will grow as demand dictates. It also uses this same technology to wrap user 'identified' PC routines and common routines. It does this by making a copy of the identified code, thus isolating it from any other callers. In fact, two or more sessions can debug the same PC concurrently. A future enhancement (still being tested) will allow the grouping of any number of tasks and address spaces into a debug group. This will become essential for problem determination in complex task or server/client scenarios as described for the runtime component in the second paragraph.

Third, IBM is a hardware and software vendor. It has the luxury of pairing the hardware architecture, z/Architecture, and the software architecture, z/OS, into one of the most powerful, if not the most powerful, operating systems. TDF is designed to exploit both architectures. The TRAP instructions are a simple example of that. The PC screening technology is another. In fact, TDF is architected more on z/Architecture than on z/OS. It requires z/OS to execute, but it is much more reliant on z/Architecture. TDF only uses 3 IBM services in the debugging of a dispatchable unit.

Anyone that has any specific technical questions about TDF need only ask.

Kenneth