For amdump, it is the planner that create the pid file, it must put its ppid in it (amdump process) (done in holding.c) For amflush, it is amflush that create the pid file, it must put its pid in it (done in Holding.pm) In both case, the driver call cleanup_holding to remove the pid, you are right, it should check for ppid
Jean-Louis On 21/11/17 08:13 PM, Nathan Stratton Treadway wrote: > On Mon, Nov 20, 2017 at 15:09:25 -0500, Jean-Louis Martineau wrote: > > perl automatically close it when it come out of scope. > > The close should before the 'if (kill' line. > > Okay, I moved the close() line up to that spot as I applied the patch. > > I can confirm that with your patch applied and my followup patch > applied, if I kick off amflush while amdump is running in the > background, it (amflush) prints out > Could not find any Amanda directories to flush. > and exits without attempting to flush anything.... but if I try again > after amdump finished, I'm able to complete the flush successfully. > > > A few followup points that came up during my testing: > > * when amflush completes now, the pid file is left out there in the > holding directory, thus preventing the directory from getting deleted: > > ===== > # ls -lR /amanda/TestBackup-holding/* > /amanda/TestBackup-holding/20171120172656: > total 4 > -rw------- 1 backup backup 5 Nov 21 18:50 pid > > /amanda/TestBackup-holding/20171121145009: > total 4 > -rw------- 1 backup backup 5 Nov 21 18:50 pid > ===== > > Off hand I don't see code in amflush or Amflush.pm > <http://Amflush.pm> > trying to clean up > the holding disk directories so I don't know if these "zombie" pid > files are related to the third issue (i.e. holding_cleanup_dir()), > but... > > * ...as a more general question I was wondering if it made sense > for programs that take a lock on a holding directory to have a way to > explicitly release that lock when they are done? (Probably this > would be by deleting the pid file, thought it could instead be > truncating it to zero length, or something.) > > The main reason I was wondering about this is that with the current > behavior of leaving a particular pid in the pid file forever, it > seems like there's a good chance of false positives when the lock is > checked by later jobs (i.e. some other, unrelated process using the > same pid when the lock-check happens). If there were some way for a > process to release the lock, false positives could only happen after a > process crashed. > > (I don't know how often these false positives would happen in > practice, but I can imagine it would cause a headache if the pid > locking a directory containing dumps needing to be flushed to vault > storage ended up getting reused by some long-runing process...) > > * I noticed that holding.c:holding_cleanup_dir() doesn't include parent > pid in the pid-file check. I haven't tracked back to figure out which > applications call the holding_cleanup* family of functions so I don't > know if that will cause actual problems, but I was wondering if that > function should instead just call call_take_holding() instead, so the > logic is all found in one place? > > > Nathan > > > > > ---------------------------------------------------------------------------- > Nathan Stratton Treadway - natha...@ontko.com - Mid-Atlantic region > Ray Ontko & Co. - Software consulting services - http://www.ontko.com/ > <http://www.ontko.com/> > GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt > <http://www.ontko.com/~nathanst/gpg_key.txt> > ID: 1023D/ECFB6239 > Key fingerprint = 6AD8 485E 20B9 5C71 231C 0C32 15F3 ADCD ECFB 6239 This message is the property of CARBONITE, INC. and may contain confidential or privileged information. If this message has been delivered to you by mistake, then do not copy or deliver this message to anyone. Instead, destroy it and notify me by reply e-mail