Cross-posted to the IBM-MAIN and CMSTSO-Pipelines lists.

Overview:
1. An ALU (Arithmetic Logic Unit) load estimate shows that CMS/TSO Pipelines is orders of magnitude more ALU-efficient than UNIX piping.
2. UNIX piping is significantly better than JCL step-to-step data passing, so Pipelines is better than JCL as well. (P>U, U>J, therefore P>J.)
3. It will take a lot of pressure to get TSO Pipes or BatchPipes into the z/OS base.
4. Customers will only bring that pressure if they have good reason, for example, from using BatchPipes, which includes TSO Pipes.
5. Installing BatchPipes is one way to experience Pipelines and the benefits of BatchPipes fittings in JCL.
6. A few successful pilot programs might break the ice and make this "piping-on-steroids" tool available to all z/OS users.

Details:

Below is the output of a program that estimates the ALU usage of Pipes versus that of UNIX command piping. I chose to estimate the number of characters reaching the ALU (Arithmetic Logic Unit) because that measure might be less affected by hard-to-account-for factors such as system load, instruction pipelining, out-of-order execution, and caching behavior. If a record had just a few referenced fields and they were close to each other, Pipes might need to load only one or two cache lines. Conversely, if there were many referenced fields spread widely around the record, more cache lines might need to be loaded.

Comparison of UNIX piping vs. CMS/TSO Pipelines based on estimated
characters reaching the ALU.
Fraction of bytes used by the average Pipes stage: 0.25

UNIX piping ALU usage / Pipelines ALU usage
+-------+--------+--------+--------+
|       |______ Record Sizes ______|
|Stages |   25   |  250   |  2500  |
+-------+--------+--------+--------+
|    10 |     55 |    134 |    157 |
|   100 |    548 |   1342 |   1570 |
|  1000 |   5479 |  13423 |  15699 |
+-------+--------+--------+--------+
Note: UNIX performance drops with both more stages and longer records.
Working set size was not taken into account; accounting for it would
add 64K*Stages for UNIX, which would increase the Pipes advantage
even further.
0.007000 elapsed seconds.

Note that increasing the number of stages and increasing the logical record length each independently improved the Pipelines advantage. This is not surprising: in UNIX, data must be copied from the input buffer to the output buffer at each and every stage, while in Pipelines such copying rarely occurs and only referenced data need reach the ALU. File size has no significant bearing on this calculation. As noted, taking working set size into account would probably show even better numbers for Pipelines.

These calculations are guesstimates. That they show an orders-of-magnitude difference means the exact numbers are not important. What is important is that 100-fold or 1000-fold improvements are possible.

Here is a snippet showing the basics; the rest was formatting:

   ...
   /* Est. Bytes used (10%) + changed rcds (10%) + delayed rcds (5%). */
   PipesBytesUsedFract = 0.25
   /* In other words, fraction of file bytes that actually enter ALU. */
   ...
   /* Estimate number of bytes per record that are used by the ALU.   */
   /* Account for UNIX DBCS, UNIX stage write, and Pipes structures.  */
   UNIX:
      return LR.iLR * Stages * 2 * 2
   Pipes: /* Parm pointer (4) + #Record descriptor (8) */
      return LR.iLR * PipesBytesUsedFract + 12
   ...

Consider the cell in the middle row and the middle column (1342). If a process ran for an hour doing heavy UNIX piping, it could take as little as 2.7 seconds (3600/1342 = 2.68) using Pipes. (If it were I/O bound, it might take longer.)
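To make the model concrete, here is a small, self-contained REXX sketch that recomputes the ratio table from the two return expressions in the snippet. The 0.25 fraction and the two formulas are taken from the estimate above; the loops and output formatting are mine, added only for illustration:

   /* REXX: recompute the ratio table from the model in the snippet.  */
   PipesBytesUsedFract = 0.25      /* fraction of bytes entering ALU  */
   say right('Stages', 7) right(25, 7) right(250, 7) right(2500, 7)
   do s = 1 to 3
      Stages = 10 ** s                     /* 10, 100, 1000 stages    */
      line = right(Stages, 7)
      do r = 1 to 3
         lrecl = 25 * (10 ** (r - 1))      /* record sizes 25/250/2500*/
         unix  = lrecl * Stages * 2 * 2    /* DBCS and stage write    */
         pipes = lrecl * PipesBytesUsedFract + 12 /* ptr + descriptor */
         line = line right(format(unix / pipes, , 0), 7)
      end
      say line                             /* one row of the table    */
   end

Running it reproduces all nine cells, for example 1342 for 100 stages and 250-byte records, which is the cell used in the elapsed-time estimate above.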
On the other hand, what takes multiple UNIX piping commands might be just a single Pipe command. Real-world testing would be best. That said, ALU usage is tied to working set size, cache load, and CPU usage. (CPU usage includes the whole instruction pipeline; the ALU is just the final stage, and it operates mostly in parallel with the rest of the instruction pipeline.) Further, this might explain why IBM suddenly added multiple additional cache levels at the same time it was pushing UNIX and C. If you could get the Flash on your baseball team, wouldn't you have him running bases? (Hmmm. I've never liked the names BatchPipes and BatchPipeWorks. It's as if someone wanted to doom the products by giving them awful names. How about FlashBatch and FlashPipes?)

These huge numbers might explain some of the IBM resistance to putting Pipes in the z/OS base. Major revenue leveling could be a valid concern (but a windfall for customers). Realistically, it would depend on the rate of adoption and reimplementation in Pipes, and it could be mitigated by appropriate marketing efforts to attract new customers. Could there be a legal argument that, since hardware revenue growth would be reduced and/or global warming mitigated, making BatchPipes or TSO Pipes part of the z/OS base is exempt from any barrier to IBM offering free software?

Other ways to get these kinds of benefits are with CMS Pipelines (in the z/VM base) and on the iSeries, where I/O and data passing (AFAIK) are more pointer/locate-mode oriented due to the large memory model. Installing and using BatchPipes is far less costly than rehosting, IMHO. Perhaps VMers who get this can share how they got indoor plumbing for z/VM.

Much as I am uncomfortable doing the job that IBM sales reps should be doing: if you understand why Pipes is so fast and productive, please do whatever you can to raise awareness of the product. There might be a free trial for your site. If you have JOBs that are long-running and/or run frequently, try using one or more as pilot projects. Many people are familiar with simple piping (UNIX style); Pipelines adds some new concepts and behaviors. I'm sure there are lots of people on both lists who are more than willing to help out (within reason). You may even be able to find someone to take it on for a fee, an hourly rate, or a contingency.

Start like this:
1. Get the runtimes for the JOBs that run frequently or are of long duration.
2. Sort the list by ElapsedTime * RunsPerMonth, descending. Your best candidates will be at the top of the list.
3. Among those candidates, find the steps that take the longest to run. Where I/O time is a large portion of step elapsed time, you may get good results. Most of the time it will involve sequential datasets, but it could involve DB2, VSAM, ISPF, or USS byte file system (bfs) files. Pipes interfaces with all of them.
4. If you need to narrow down your list, use VIO to confirm where the best potential I/O savings are.
5. Where that helps, look at BatchPipes fittings in place of utilities and trivial single-function programs.
6. Redeploy JOBs that are still long-running or resource-heavy using batch TSO, TSO Pipes, and REXX; a first conversion might look like the sketch after this list. For maximum module reuse, create a JCL proc that duplicates your foreground environment.
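To give a feel for what such a pilot looks like in practice, here is a minimal sketch in CMS Pipelines syntax; the file names and the /ACTIVE/ selection are hypothetical, and under TSO Pipes the disk stages are spelled differently, but the multistream idea carries over. It replaces a trivial select/reject utility with one pipe that reads the input once and writes both outputs in a single pass:

   /* REXX: a minimal select/reject pilot, CMS Pipelines syntax.     */
   /* File names and the /ACTIVE/ selection are hypothetical.        */
   'PIPE (END ?)',
      '<  INPUT DATA A',            /* read the source file          */
      '| L: locate /ACTIVE/',       /* matching records flow on...   */
      '| >  ACTIVE DATA A',         /* ...to the selected-record file*/
      '?  L:',                      /* locate's secondary stream gets*/
      '| >  REJECT DATA A'          /* ...the non-matching records   */
   exit RC

Because locate passes records by reference on both of its output streams, no record is copied along the way; that is the locate-mode versus move-mode difference the numbers above are about.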
Summary of BatchPipes and CMS/TSO Pipelines benefits, from this and other analyses:
- Greatly reduced development, maintenance, and deployment costs. Staff is more expensive than hardware and is in short supply, especially now.
- Faster development, maintenance, and deployment. I once rewrote a 600+ line COBOL program in about a dozen Pipes stages; most of the time was spent analysing the COBOL code.
- Better hardware utilization and/or lower costs.
- Faster batch throughput.
- Modernization of legacy systems at low cost and low risk. (A recent podcast said this is a critical issue with some legacy systems. NJ Governor, are you listening?)
- Consistency in and flexibility of production support tools. Pipe fittings can be used on almost any DD.
- Low incremental deployment costs, especially versus rehosting to a new platform.
- Ability to apply changes in phases, to the most resource-intensive and long-running processes first. This means much less disruption, cost, and risk than rehosting.
- Flexibility to adapt to changing requirements. Fewer "big bang" projects where the business needs change halfway through a five-year project.
- Mitigating global warming. We must do everything we can, even if it's too late. (The IPCC has been using linear models that have consistently undershot actual values. Are we already exceeding the carrying capacity of the planet? Some scientists are using terms like "overshoot" and "collapse".)
- Both BatchPipes and TSO Pipes have highly productive learning curves. You can do a lot with just a little training, and the more you know, the more you can do.
- For those who think UNIX and its derivatives are the future, watch out for real, locate-mode piping. Should you think of UNIX's piping method as tin-can-and-string versus locate-mode piping's smartphone? It wouldn't hurt. You rarely see a UNIX command with more than a dozen stages, while Pipes specifications can run to 100s and 1000s of stages.

So, rather than moving away from z/OS, customers should stay where they are and use Pipelines and BatchPipes. If there is anyone wanting to rock the boat to make the world a better place, here's your chance.

Thank you for your attention.

OREXXMan
Would you rather pass data in move mode (*nix piping), in locate mode (Pipes), or via disk (JCL)?
Why do you think you rarely see *nix commands with more than a dozen filters, while Pipelines specifications commonly run to 100s of stages, and 1000s of stages are not uncommon?
REXX is the new C.