CA MSM First Contact

Joel Ewing Thu, 12 Aug 2010 17:59:07 -0700

A long review:

After seeing some of the favorable comments on ibmmain on CA MSM 3.0, I
was encouraged to try it out to see if MSM really did simplify things,
and my results so far have been more mixed than some of the previous
comments on the product.  Perhaps some of my experiences may save time
for others.


I like the concept of having a consistent and hopefully more permanent
interface with CA for all CA products, especially in terms of product
download and maintenance conventions, and we will probably eventually
try MSM on all our CA products, but getting MSM to the point that this
might be possible has definitely been a struggle.  The rosy marketing
hype that implies that MSM allows novice SysProgs to do CA maintenance
glosses over the likelihood, in my experience, that installing and
maintaining MSM itself and the interfaces that make it work, and
establishing installation conventions for usage would be difficult for a
novice.  All the nice easy-use demos start with a functional MSM.  Until
I see evidence to the contrary, I would be very wary of assuming that a
novice SysProg would be able to handle installation and maintenance of
MSM itself.

I think part of the problem is that with a product like MSM that deals
with both UNIX and traditional MVS components, you have a whole new
realm of possible differences and incompatibilities in security and
other local installation conventions and practices that are just not yet
well understood by product designers.  There appear to be implicit
assumptions made about our installation environment that were just not
valid.  Both MSM and CAICCI introduce new jargon and acronyms, and I got
the strong impression that much of the documentation was written by
those so immersed in the jargon that the forest gets lost in the trees.
There is no grand overview anywhere that I can see to quickly convey
MSM to a new user, for example:
        That the underlying MSM maintenance philosophy is that each separate
product release has a set of TGT/DLIB libraries and a corresponding CSI
with product-release-specific prefixes used for maintenance only, and
that a separate set of run time libraries with related names is
generated for each separate production environment.  Understanding that,
unlike z/OS, SMP/E target libraries will not be used directly for
production is essential to understanding how MSM can function.
        An overview of the various address spaces involved with MSM (4
long-term plus other short-term), and what types of functions in MSM
cause tasks to be farmed out to other address spaces or cause the
creation of other address spaces for running tasks.
        An overview of the interaction of all the various definitions in MSM -
e.g. downloading catalog tree requires Software Acquisition settings for
System and User, http access to CA, lack of My Products list on CA
Online, etc.;  downloading product release maintenance requires Software
Catalog System Settings and ftp access to CA; Applying product release
maintenance requires the product CSI  defined under SMP/E Environments
(from MSM install or CSI migration), the CSI to be in the MSM CSI
"working set", appropriate settings under System Settings Software
Installation, and no direct production use of CSI target libs, etc.;
Deployment requires System Registry Definitions, User Settings Remote
Credentials, a Methodology definition, and a Deployment definition, and
appropriate CA-CCI setup to validate System Registry entries and run the
deployment dataset generation task.  A new user can eventually deduce
these interactions, but an overview would save much time and trial and
error.  I think I now understand the basics, but still run into
occasional surprises.
        What tasks generate new filesystems, and when and how these
filesystems may be deleted.
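
The interactions listed above can be captured as a small lookup table.
This is only a sketch built from the items named in this post (each of
which ends in "etc.", so the lists are not complete); the dictionary
name, the activity labels and the helper function are made up for
illustration and are not anything MSM provides.

```python
# Requirements per MSM activity, transcribed from the overview above.
# The lists are incomplete by the post's own admission ("etc.").
MSM_PREREQS = {
    "download catalog tree": [
        "Software Acquisition settings for System and User",
        "http access to CA",
        "no My Products list on CA Online",
    ],
    "download product release maintenance": [
        "Software Catalog System Settings",
        "ftp access to CA",
    ],
    "apply product release maintenance": [
        "product CSI defined under SMP/E Environments",
        "CSI in the MSM CSI working set",
        "System Settings Software Installation settings",
        "no direct production use of CSI target libraries",
    ],
    "deployment": [
        "System Registry definitions",
        "User Settings Remote Credentials",
        "Methodology definition",
        "Deployment definition",
        "CA-CCI setup for System Registry validation and deployment task",
    ],
}


def missing_prereqs(activity, satisfied):
    """Return the listed requirements for an activity that are not yet satisfied."""
    return [item for item in MSM_PREREQS[activity] if item not in satisfied]


if __name__ == "__main__":
    done = {"ftp access to CA"}
    for item in missing_prereqs("download product release maintenance", done):
        print("still needed:", item)
```

A checklist like this is the kind of one-page overview that would have
saved a good deal of trial and error.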

In fairness to CA, I would be willing to bet that the IBM counterpart to
MSM, which is getting similar ease-of-use marketing hype and which
depends on WAS, is probably also glossing over the issues of product
installation.  And at least the initial reports we got back from SHARE
suggest the IBM counterpart may assume you have several GiB of real
storage lying fallow, compared to the 300-400 MiB real (on a relatively
unconstrained system) I see so far to support MSM.

Going from dealing with MSM prerequisites, through starting to play with
MSMSetup and getting partial functionality in a test environment, to
finally doing a successful product deployment on production took me
about 2 weeks!  I found there to be many things
"assumed" in the documentation, or in some cases explicitly stated, that
just weren't correct in our environment.  Both MSM and the additional
pieces of Common Services required to support it introduce terminology
and conventions which may be obvious to CA, but are definitely not
obvious to us when coming from another paradigm and being unfamiliar
with MSM and CA-CCI internal design.  Getting MSM set up and running
required a long sequence of resolving one error, and then advancing to
the next.  Learning how the various definitions in MSM interact with
each other was also a trial and error process.

The following is a list of some of the major struggles along the way:

        If you haven't ever had a need for Common Services CA-ENF and CA-CCI
before, the first hurdle is to figure out how to install and configure
these.  Right off the bat you have to figure out what other components
are required.  CA-ENF lists Datacom/AD as a pre-requisite and MSM has a
Datacom component (Datacom/MSM), which suggests a relationship, but it
turns out that unless you have other CA products that require
Datacom/AD, CA-ENF can run without it and MSM doesn't require it; but
you do have to explicitly configure CA-ENF for "NODB".
        The discussions in Common Services documentation and MSM documentation
on CA-ENF and CA-CCI and the examples all seem geared to multiple
systems and communication with remote systems.  Nowhere does there seem
to be any discussion of, or examples for, the minimal requirements for
the trivial case where all you have is MSM running on one production
system used to deploy to that same system.  It seemed reasonable to me
to interpret that the CA-CCI spawn support only applied to remote target
systems - later when I got to MSM deployment I discovered that
assumption was incorrect, and that CA-CCI also needed the ability to
fire up two other address spaces.  At least the lengthy CCI
configuration  discussion of protocols and additional related PROCs for
remote communication did turn out to be unneeded with a single system.

        CA-ENF runs in its own address space, and you don't want to cycle it
unnecessarily, as the address space may be non-reusable until next IPL.
By default ENF puts out an inappropriately-highlighted red message when
it successfully starts, which will startle an awake Operator, as red is
normally reserved for serious system problems.  It also puts out red
messages when it shuts down and requires Operator response to a WTOR to
shutdown, which is somewhat overkill when the only usage is MSM.  This
requires adding system automation support to handle a quick normal
system shutdown.  Unlike some other-vendor products, there is no
alternative "I really mean it"  MODIFY shutdown command  variant to
bypass the WTOR.

        CA-CCI runs under the CA-ENF address space, provided you include the
correct parameters, catch a parameter that must be uncommented
(DCM(CAS9DCM3)), and add an additional required DD for spawn support.
These requirements are split across multiple manuals, and I got an
erroneous impression from some of the documentation that I might not
need all of this for a local system with no remote systems.  CA-CCI can
also fire up some additional spawned tasks to support MSM tasks; these
run in additional address spaces and require that PROCs be set up
appropriately.  This also introduces an undocumented RACF requirement
for CAENF to be authorized to the appropriate RACF OPERCMDS profiles to
issue a "START" command.

        MSM requires a ZFS/HFS filesystem for installation setup.  We dislike
having to modify permanent mounts in BPXPRMxx, so, seeing no reason not
to, I created this as a filesystem that would be automounted at /u/msm.
We, and probably most sites using automount, have automount filesystems
default to "nosetuid", but I did not immediately remember this.  This
caused some problems during the MSMSetup
process which we resolved by running MSMSetup as superuser on our test
system, but it was not until trying to start the MSMTC Tomcat address
space that we finally got an explicit failure that indicated the problem
was that we were running from a nosetuid filesystem.  Putting in
explicit /etc/u.map definitions for /u/msm to allow "setuid" resolved
that issue, but we later encountered other problems with MSMSetup that
still required running it as superuser.

        Unix commands are used to start MSMSetup, but under the covers this
process uses ISPF browse and edit, so you must run this under TSO OMVS,
not under an MVS Telnet session.  This is not stated and not obvious
until trying MSMSetup.

        When MSMSetup requests a userid and password, this password must be one
that is valid for MVS FTP usage (we have some subtle user-specific,
extra requirements for password syntax for FTP that make it difficult
for ftp scripts to revoke an arbitrary userid by password failures).

        During initial attempts, I quickly found that MSMSetup would not
proceed until CA-ENF and CA-CCI were active (not the same as fully
functional, as we later discovered), which for us took an IPL on
production.  The documentation is clear that these CCS components are a
requirement to run MSM, but it is not clear they are required even to
run MSMSetup.  I proceeded to test MSMSetup in a Test MVS environment
until next IPL.  This allowed eventually getting to the point of
activating MSMTC, but at that point I was somewhat limited in what I
could test once I verified a suspicion that ftp outside the corporation
wasn't possible from that environment.  This did reveal an interesting
user interface problem in that the only message that gets reflected to
the MSM user for ftp failures is "Failure in PAS processing".  Would you
believe the documentation and Help do not define "PAS" anywhere?  I
finally found a diagram somewhere in one of the manuals that showed
"PAS" as a component sitting between MSM and CA Online, but by that time
using TSO/ISPF/SDSF I had already stumbled across some SYSOUT associated
with MSMTC that explicitly indicated an FTP failure to the CA site, and
was able to confirm that current local FTP restrictions were the
problem.  A more useful MSM error response ("failure in establishing FTP
connection to CA" perhaps) would have made the real problem obvious.

        Using RACF to limit access to MSM seemed a reasonable way to go, but
documentation on RACF requirements for this, and RACF requirements just
to install MSM in general, tended to be of dubious accuracy.  The
example to set up the CAMSM profile class name was missing required
parameters (POSIT) and requires a class name that violates IBM
recommendations and generates a warning.  The ADDGROUP example has an
invalid parameter (NAME).  There is no caution to activate the CAMSM
class and activate GENERIC support before defining generic profiles for
the class.  Generic profile examples have a trailing "*" when a "**" is
required.  No profile is included in the example for covering Task
Deletion, added in MSM 3.0.  More seriously, the parts of MSMSetup that
internally
switch to setuid 0 run under the RACF userid specified for SUPERUSER in
the installation BPXPRMxx parmlib member, and one finds that MSMSetup
implicitly [and erroneously] assumes this userid will have appropriate
levels of access to MSM-related MVS datasets and to other FACILITY
profiles that are only documented as a requirement for the userid
running MSMSetup and the userid running MSM address spaces.  One would
not want to give this authority permanently to the BPXPRMxx Superuser
(certainly not the MVS MSM dataset access), but it must be done
temporarily or MSMSetup fails.  Initially running unintentionally with a
nosetuid filesystem actually avoided these errors (while introducing
others).

        Various RACF, setuid, and superuser issues that cause failures while
running MSMSetup can get you in a situation where MSMSetup is convinced it
must do a restart but can't, because it assumes some file has been built
that wasn't built because of the previous failure.  The only safe
recourse seemed to be to restore the msm filesystem back to pre-MSMSetup
state and start over again.

        MSM itself requires three additional address spaces, MSMMUF, MSMDBSRV,
and MSMTC, which must be started in this order and shut down in reverse
order.  MSMTC burns a significant amount of CPU for start up, and can
use a significant amount for some MSM tasks, so you want it in a service
class that doesn't outrank your "loved" ones.  In my limited experience
so far around 96% of CPU used by MSMTC was zAAP-eligible.  For some
unexplained reason MSMDBSRV does not appear to be designed to shut down
with a standard stoP command:  you must start a batch job or another
started task to shut it down (Yuk).
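
For automation purposes, the ordering rule above can be written down as
a small sketch.  The address space names and the ordering come from this
post; the function names and the "method" labels are hypothetical, and
the assumption that MSMMUF and MSMTC respond to an ordinary operator
STOP is mine, not something the documentation confirms.

```python
# Start order as described above; shutdown is the exact reverse.
START_ORDER = ["MSMMUF", "MSMDBSRV", "MSMTC"]

# MSMDBSRV does not appear to honour a standard STOP command, so it is
# flagged as needing a batch job or a separate started task.
# The "operator STOP" label for the other two is an assumption.
STOP_METHOD = {
    "MSMMUF": "operator STOP (assumed)",
    "MSMDBSRV": "batch job or separate started task",
    "MSMTC": "operator STOP (assumed)",
}


def startup_sequence():
    """Return the address spaces in the order they must be started."""
    return list(START_ORDER)


def shutdown_plan():
    """Return (name, method) pairs in the order they must be shut down."""
    return [(name, STOP_METHOD[name]) for name in reversed(START_ORDER)]


if __name__ == "__main__":
    for name, method in shutdown_plan():
        print(f"{name}: {method}")
```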

        The first time you start MSMTC, it needs the authority to create a
dataset with its own userid as a HLQ.  The documentation implies this is
only a requirement for users that logon to MSMTC, not for the MSMTC
userid itself.

        First logon for a user to MSM asks for URL and ports for http proxy and
ftp proxy and CA support logon credentials, but it does not ask for
authentication credentials for your proxy servers.  If your proxy
requires this, then an attempt at this point to "Update Catalog Tree"
will fail with "Failure in PAS processing", which in this context was
eventually deduced to really mean "Failure in http communication with
CA".  After much hunting one discovers Settings -> User Settings ->
Software Acquisition and a place to define proxy userid and password
settings.  Then "Update Catalog Tree" works, but the list of software
products is incomplete (MSM missing).  Additional research:  One must
log on to the CA web site and delete any products from the "My Products"
lists.  Retry, and get a different kind of failure.  In general the
default failure message displayed to the MSM user rarely gives a clue to
the root cause.  Detailed task output messages (which take some
stumbling around to locate the first time) reveal I must do additional
customization for "My Account" settings on CA Online to select "Branded
Products".  The CA Web page already indicates that setting, but after a
re-save of settings MSM "Update Catalog Tree" finally works as expected!

        Decided to try migration of CCS 12.0 and MSM 3.0 CSIs into MSM using
default selections.   Appeared to work OK.  Asks whether new CSIs should
be added to "working set", another new CA terminology - in this case
having nothing to do with page frames.  I discover later that downloaded
PTFs won't be cross checked with a CSI and maintenance can't be done
against a CSI unless the CSI is in the "working set".

        Next I decided to try a download of maintenance for MSM 3.0 using the
Software Catalog and selecting "Update Catalog Release" (which really
means "Download Release Maintenance").  Turns out in retrospect MSM
itself is a "bad" product to test this on, as this downloads all PTFs
and Informational APARs against all FMIDs of the product since the base
level of the FMID, and some MSM FMIDs have maintenance all the way back
to 2005!  Initially I had some security issues that were causing a
partial failure during the download processing, but you couldn't tell
this from the MSM user interface, which just indicated more and more
packages were being processed (up to 400 before I found proof it was an
infinite loop).  From looking at MSMTC address space SYSOUT it became
evident the MSM task was looping on a failure rather than making
progress, and I had to terminate MSMTC to terminate the task.  After
fixing the problem, I ended up with 973 items being downloaded for MSM,
with only 5 or 6 of the PTFs being outstanding maintenance.  This is a
major difference from IBM current download strategies that only download
PTFs missing from a specified CSI zone.  In products like MSM with older
FMIDs, the CA approach creates a lot of unwelcome visual clutter.  You
can manually delete items that are no longer relevant, but you soon find
you don't want to do this because the next time you download you have
the overhead of bringing all the deleted items down again.  MSM won't
let you Apply PTFs  again that have already been applied, so extra PTFs
are just an annoyance and take up filesystem space; but, you will have
to develop conventions for knowing which informational APARs have
already been considered to avoid revisiting obsolete material on the
next maintenance cycle.
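
One possible convention for the informational APARs is to keep a simple
list of the ones already reviewed and compare each new download against
it.  The sketch below assumes you can get the downloaded APAR
identifiers into a list by some means outside MSM; the identifiers
shown and the function name are invented for illustration.

```python
def unreviewed_apars(downloaded, already_reviewed):
    """Return downloaded APAR ids not yet reviewed, preserving download order."""
    reviewed = set(already_reviewed)
    return [apar for apar in downloaded if apar not in reviewed]


if __name__ == "__main__":
    # Hypothetical identifiers, not real CA APAR numbers.
    downloaded = ["IA00001", "IA00002", "IA00003"]
    reviewed = ["IA00001", "IA00002"]
    print(unreviewed_apars(downloaded, reviewed))
```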

        Attempting to apply MSM maintenance found two prerequisite PTFs
(VE0SP05 and VD0SP05) missing.  There was no clue how to resolve, no
clue which PTFs contained the prerequisite requirement, and you cannot
even proceed to RECEIVE the PTFs into the SMP/E zone (which would allow
you to easily determine the source of the requirement) without first
resolving the missing prerequisites.  Only recourse is to check each
individual PTF to find which have the missing pre and manually exclude
those from maintenance in order to proceed.  Not that difficult with 5
PTFs to apply, but I would hate to do this with 100.
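
With 100 PTFs the manual check could be scripted, provided the
prerequisite list of each PTF has first been extracted by hand or by
some other means (MSM does not hand you this).  The sketch below takes
such a mapping and reports which PTFs would have to be excluded.  It
also excludes PTFs that depend on an excluded PTF, which is my own
assumption about what would be needed rather than documented MSM
behavior.  VE0SP05 and VD0SP05 are the missing prerequisites named
above; every other PTF name is a made-up placeholder.

```python
def ptfs_to_exclude(prereqs_by_ptf, missing):
    """Return, sorted, the PTFs that need a missing or excluded prerequisite."""
    unavailable = set(missing)
    excluded = set()
    changed = True
    # Repeat until no new exclusions appear, so chains of dependencies
    # on an excluded PTF are caught as well.
    while changed:
        changed = False
        for ptf, prereqs in prereqs_by_ptf.items():
            if ptf not in excluded and unavailable & set(prereqs):
                excluded.add(ptf)
                unavailable.add(ptf)
                changed = True
    return sorted(excluded)


if __name__ == "__main__":
    # Placeholder PTF names with hand-collected prerequisite lists.
    candidates = {
        "PTFA": ["VE0SP05"],
        "PTFB": [],
        "PTFC": ["PTFA"],
        "PTFD": ["VD0SP05"],
        "PTFE": [],
    }
    print(ptfs_to_exclude(candidates, ["VE0SP05", "VD0SP05"]))
```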

        Those accustomed to the SMP/E dialog options of inspection and
intervention prior to each job in a RECEIVE, APPLY CHECK, APPLY, ACCEPT
CHECK, ACCEPT maintenance sequence may have to rethink maintenance
strategies for the MSM environment. In MSM, RECEIVE/APPLY and ACCEPT are
two completely different tasks and are initiated differently.  Both have
options for also doing the "CHECK", but under MSM it is essentially
useless.  Both missing prerequisites and HOLD issues are resolved in
MSM.  Once MSM decides it has all the information, all SMP/E jobs run
back to back with no chance to inspect results before the next fires.
There is no opportunity to make changes and redo a CHECK step if there
is some problem or the wrong datasets are being used.

        In our usage of SMP/E for various products we have tended to follow the
z/OS example of using SMP/E target libraries as the live production
libraries. Our installation in the past has typically done minor
maintenance to products by doing an APPLY CHECK to find the few affected
libraries, cloning those libraries, modifying SMP/E CSI to point to the
clones, running APPLY CHECK to verify only the intended clone libraries
are updated, running APPLY, and finally staging the updated clone
libraries into production.  That strategy won't work under MSM.  The MSM
design appears to require that SMP/E target libraries only be used for
maintenance and that an entirely separate copy of the libraries be used
for running production on a target system.  Not an insurmountable
problem, but to make CCS maintainable under MSM I will first have to
create a run time set of libraries and locate all the various places
where there are currently direct references to SMP/E target libraries,
as the MSM strategy was not known when CCS support for MSM was installed.

        The deployment process (creating a set of run time libraries for a
system) introduces new CA buzzwords "System Registry", and "methodology"
(which really means "dataset name mapping" (why not call it that?)).
The System Registry is used for defining explicitly or implicitly the
MVS target system on which the run time libraries will be built.  Where
only one system is involved, the simplest choice appears to be to define
a "Staging System", which if SMS controls allocation requires only a
name to build libraries on the same system as MSM.  To me this is a
"local" install, not a "remote" one, but confusingly MSM treats this as
a remote target.  After trying to guess what the error messages are
trying to convey (some unintentionally humorous, like "x is not remotely
valid"), you finally come to the conclusion that the local system must
be defined as a "sysplex" or "non-sysplex" as if it is remote, and you
must supply "Remote Credentials" (Settings -> User Settings -> Remote
Credentials) for submitting jobs on the local system.  Even though up to
now MSMTC has had no problem initiating address spaces for tasks under
your userid that create data sets on the local system, for deployment on
the local system it uses a much more complicated interface through CA-CCI.

        The final counter-intuitive behavior has to do with the actual defining
of Remote Credentials, and can cause authentication failures on the
deployment task.  It makes no difference if all your z/OS systems are
running RACF with "MIXED CASE PASSWORD SUPPORT NOT IN EFFECT" and all
the other logon interfaces on z/OS fold lower to upper case before
validation.  The MSM deployment interfaces are not smart enough to know
this, do not fold lower case to upper case, and credentials with lower
case in the password will fail.  The HELP tells you the credentials are
case-sensitive, but that doesn't tell you what you really need to know:
that if RACF mixed case is not enabled, you must use upper case only.
Unless you work daily with RACF internals, you tend to think of RACF
passwords as being case-insensitive or even lowercase rather than
uppercase by default, since everyone keys in passwords in lower case
and they are folded under the covers.
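
The rule can be stated in a couple of lines.  This is only an
illustration of the folding that the other z/OS logon interfaces do and
that you must do by hand before keying Remote Credentials into MSM; the
function name and the flag are hypothetical, and the sample password is
obviously not a real one.

```python
def credential_for_msm(password, mixed_case_enabled):
    """Return the password as it must be entered in MSM Remote Credentials.

    When RACF mixed-case password support is not in effect, fold to
    upper case, since MSM does not do this for you.
    """
    return password if mixed_case_enabled else password.upper()


if __name__ == "__main__":
    print(credential_for_msm("secret1", mixed_case_enabled=False))
```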
-- 
Joel C. Ewing, Fort Smith, AR        jcew...@acm.org
