David -

It's strange that the TSM server Activity Log shows no ANR message of any kind relating to the dsmadmc session loss: I would expect at least some record of the session drop from its end. Based upon what you report, it would appear that the TSM server was not responsible for the dsmadmc session dropping, which is to say that there is no TSM timeout involved. I would thus look for other causes.

Check for any incidental entry in the dsmerror.log. One place to look is in the AIX accounting records, searching for the dsmadmc process name, and to particularly look for ac_flag having the AXSIG (Killed by signal) bit set, which would indicate that the session met a fate involving an OS event. If so, I would further look in the AIX Error Log for any record of the process demise, which would reveal the cause.

Miscellaneous things can cause mysterious terminations, the Tcsh autologout (http://www.erdc.hpc.mil/documentation/Tips_Tricks/autologout) being a gross example. If you're going through a firewall facility of some kind, there may be some port use termination therein, based upon excessive duration.
I run dsmadmc 24 x 5, and never see any session loss, per se. There is the standard ANR0482W "session termination" based upon the server IDLETimeout value, but that's "under the covers" and does not result in dsmadmc process loss: upon next keyboard action, the interaction resumes (ANR0402I) within the same ongoing dsmadmc process, which is to say no TSM login required.

One thing for sure is that your script is way too simple, lacking any error handling, beginning with return code/status evaluation between command invocations. I would recommend using Perl, where you can readily program error detection, handling, logging, and recovery.

what I can think of,
Richard Sims

On Sep 20, 2007, at 1:23 PM, Taylor, David wrote:
I see the "ANR2017I Administrator ADMIN issued command: BACKUP STGPOOL..." in the actlog at the time that the command was first issued. The ba stgpool started and was running fine. 12 hours (exactly) later, the script reported the "ANS1017E session rejected" and went on to the next command, which was "ba db...", however the ba stgpool was still running. There was nothing recorded in the actlog when the script reported the ANS1017E other than the commencement of the ba db command. It appeared that command line 'dsmadmc -id=admin -pass=$TSMPWD "ba stg collgoldprimarypool collgoldcopypool maxpr=3 wait=yes"' timed out waiting on a return from the command. It, for all practical purposes, orphaned the ba stg and continued processing the rest of the script. I can find no time-related values in the server's configuration that are anywhere near 12 hours (43,200 seconds). I guess that at this point, I am comfortable in understanding what happened, and am now interested in finding out if that timeout can be adjusted.
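
As an illustration of the return code evaluation between command invocations recommended in the reply above, here is a minimal shell sketch (the reply suggests Perl; shell is used here only because the original script is a shell script). The helper name run_step, the LOG variable, and the log path are hypothetical, and the sketch assumes that dsmadmc exits with a nonzero status when a command fails or the session is rejected, which should be verified against your client level.

```shell
#!/bin/sh
# Hypothetical log location; override by setting LOG in the environment.
LOG="${LOG:-/tmp/tsm_nightly.log}"

# run_step LABEL COMMAND [ARGS...]
# Runs COMMAND, appends a timestamped OK/FAILED line to $LOG,
# and returns COMMAND's exit status so the caller can decide what to do.
# Only LABEL is logged, so a password on the command line stays out of the log.
run_step() {
    rs_label="$1"
    shift
    "$@"
    rs_rc=$?
    rs_now=$(date '+%Y-%m-%d %H:%M:%S')
    if [ "$rs_rc" -ne 0 ]; then
        echo "$rs_now FAILED rc=$rs_rc: $rs_label" >> "$LOG"
    else
        echo "$rs_now OK: $rs_label" >> "$LOG"
    fi
    return "$rs_rc"
}

# Example use (commented out; assumes dsmadmc is installed and TSMPWD is set):
# run_step "backup stgpool" dsmadmc -id=admin -pass="$TSMPWD" \
#     "ba stg collgoldprimarypool collgoldcopypool maxpr=3 wait=yes" || exit 1
# run_step "backup db" dsmadmc -id=admin -pass="$TSMPWD" "ba db ..." || exit 1
```

With a wrapper like this, a rejected session during the storage pool backup would stop the script instead of letting it fall through to the database backup while the first process is still running.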