We are having an intermittent problem with SFS and I'm hoping someone
may have some ideas of what to pursue next.
 
We have several batch jobs that run under VMBATCH overnight.  Sometimes
they are not able to create a file in a directory, even though most
of the time they succeed.  The only difference between executions is the
file name; for many of these the File Type is the date.
 
In the job I am most familiar with, these are the specifics.
 
The job runs Monday-Saturday.  This year, it has failed on January 4,
January 12, February 9, March 18, March 25, and April 13.  It has run
successfully the other days.  Other than the QUERY statements below, it
has not changed.
The job runs in a work machine, WORK7.
The job is submitted by the User ID of the database owner.
The SFS directories are owned by a 3rd user.  Failures occur in many of
the subdirectories, not just one subdirectory owned by this user.  This
user is the owner of most of the directories containing the data files
we create in batch, so I don't think it's significant that it's the ID
that has the problem.  However, as far as I know, it is the only ID that
does have the problem.
This job uses VMLINK to acquire a write link to the SFS directory.  This
always looks to be successful--no error is given.  (Other jobs use
GETFMADDR and ACCESS to acquire the write link to the directory; this
always appears successful as well.)
Once the file is ready to be copied from the Work Machine's 191 disk to
the SFS directory, the intermittent error appears.  The vast majority of
the time, the write is successful.  However, sometimes, the job gets
this error message: 
DMSOPN1258E You are not authorized to write to file XXXXXX 20110413 Z1
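 
For context, the failing step boils down to something like the sketch
below.  This is only a rough outline, not our actual EXEC: the file pool,
directory, file name, and mode letter are placeholders, and the real job
acquires the directory with VMLINK (or GETFMADDR plus ACCESS) rather
than the bare ACCESS shown here.

   /* Minimal sketch of the write step (placeholder names)           */
   dirid = 'POOL1:DIROWNER.DATA'      /* made-up filepool:dir name   */
   ft    = Date('S')                  /* yyyymmdd used as File Type  */
   'ACCESS' dirid 'Z'                 /* get the directory at mode Z */
   If rc <> 0 Then Exit rc            /* this part never fails       */
   'COPYFILE XXXXXX' ft 'A = = Z (REPLACE'
   Say 'COPYFILE rc='rc               /* DMSOPN1258E shows up here   */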
 
The file is not large--last night's file was only 12 blocks.  
 
At the suggestion of our systems programmer, I've put in a lot of query
statements.  I've issued QUERY LIMITS for the job submitter; it has only
used 84% of its allocation, with several thousand blocks available.  The
SFS directory owner has used only 76% of its allocation, with several
thousand more blocks still available.  The filepool is not full.
 
I've issued QUERY FILEPOOL CONFLICT.  There is no conflict.
 
I've issued QUERY ACCESSED.  It shows that the directory is accessed R/W.
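 
In case it helps, the extra diagnostics are wired into the EXEC roughly
like this (the user IDs are placeholders); the output just goes to the
console, which VMBATCH spools, so the state at the moment of failure ends
up in the job log:

   /* Sketch of the added diagnostics (placeholder user IDs)         */
   'QUERY LIMITS FOR SUBMITID'     /* space used by the submitter    */
   'QUERY LIMITS FOR DIROWNER'     /* ...and by the directory owner  */
   'QUERY FILEPOOL CONFLICT'       /* any outstanding conflicts      */
   'QUERY ACCESSED'                /* confirm the dir is still R/W   */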
 
When the write is unsuccessful, the program loops through 5 tries of
releasing the access, reacquiring it, and attempting to write the file
again.  The retries have never succeeded.  I've tried both a COPYFILE
and a PIPE to write the file; neither works once there has been a
failure.
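 
The retry logic has roughly this shape (again, the names, mode letter,
and bare ACCESS are placeholders; the real EXEC re-drives VMLINK):

   /* Sketch of the retry loop (placeholder names and mode)          */
   Do try = 1 To 5
      'RELEASE Z'                               /* drop the access   */
      'ACCESS POOL1:DIROWNER.DATA Z'            /* re-acquire it     */
      'COPYFILE XXXXXX' ft 'A = = Z (REPLACE'   /* COPYFILE first    */
      If rc = 0 Then Leave
      'PIPE < XXXXXX' ft 'A | > XXXXXX' ft 'Z'  /* then a PIPE write */
      If rc = 0 Then Leave
   End
   If rc <> 0 Then Say 'Write still failing after 5 tries'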
 
We've looked at the operator consoles to see if we can find any jobs
running at the same time.  We haven't found any that are accessing that
directory structure.
 
There aren't any dumps to look at--it looks perfectly successful other
than the fact that it won't write the file.
 
Does anyone have any suggestions of something to try next?

 
Nora Graves
nora.e.gra...@irs.gov
 
