Morning! We have a scenario where I *think* the problem is a write cache issue, but I'm not 100% sure.

We have JobB dependent on JobA. JobA internally (i.e., not via --output) writes three small files to NFS-shared disk, the first of which is then parsed by JobB, hence the dependency (using --dependency=afterok:<jobid>).

The error we are seeing is that JobA reports a successful finish, then JobB starts and fails because the file doesn't exist. In particular, we are seeing this when JobA runs on NodeX but JobB runs on NodeY.

The file does exist, but it's being created ~0.6 seconds after JobB begins executing:

JobB's .out file stating "fail", creation time: 00:44:33.353336
JobA's .txt file, creation time: 00:44:33.951973

I presume this is related to write cache buffers. What are the community's ideas on how this is best handled?

Cheers
L.

------
The most dangerous phrase in the language is, "We've always done it this way."
- Grace Hopper
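
P.S. To make the question concrete: one workaround I can imagine (a minimal sketch, not something we have validated on our cluster) is to have JobB poll for the file before parsing it. The helper name wait_for_file and the timeout values are hypothetical, and the idea that listing the parent directory prompts the NFS client to revalidate its cached directory entries is an assumption; the actual behaviour depends on the NFS mount options.

```python
import os
import time


def wait_for_file(path, timeout=30.0, interval=0.5):
    """Poll until `path` is visible and non-empty, or `timeout` seconds pass.

    Returns True if the file appeared, False on timeout.
    """
    parent = os.path.dirname(os.path.abspath(path))
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Assumption: listing the parent directory nudges the NFS client
            # into refreshing cached directory entries. Whether it does
            # depends on the mount options in use.
            os.listdir(parent)
            if os.path.getsize(path) > 0:
                return True
        except OSError:
            pass  # file (or directory) not visible yet; keep polling
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

JobB would call this at the top of its script and fail with a clear message if it returns False. A non-empty file is not necessarily a complete one, so if JobA writes the file in several steps it would be safer for JobA to write to a temporary name and rename it once it is finished.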