Dyer, Rodney wrote:
> Windows already has basic service watchdog capability, configured in 
> Services.msc; it can restart a service or run a task script (such as 
> sending email) on total failure.
> 
> I believe that the latest versions of the Windows OpenAFS client are much 
> more reliable.  But in the past, this was very much not the case.  We wrote 
> our own scripts to watchdog the service, and of course our users found out 
> even earlier than our scripts did.  The problem is that even IF your service 
> is 100 percent bug free and reliable, the OS/hardware surrounding the service 
> may not be.  This can sometimes put your service into a hostile environment 
> for which it was not designed.  You cannot anticipate every contingency in 
> your software design logic.  Services will die, and sometimes they die 
> because of erratic lower-level return values from core OS kernel DLLs, or 
> RAM problems, no matter how bold Microsoft's reliability claims are.  Your 
> expectations of return values are only what is documented.
> 
> Rodney

I have certainly fixed lots of causes of afsd_service.exe crashes.
The most recent one was the pioctl messaging snafu, in which two pioctl
requests executing at the same time could get their messages crossed.
The person who designed the interface made an incorrect assumption.
That, combined with the receiver's failure to validate the message
contents, resulted in a situation in which either afsd_service.exe or
the caller (fs.exe, afscreds.exe, afslogon.dll,
netidmgr.exe/afscred.dll, etc.) could terminate unexpectedly.
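
To illustrate the class of bug, here is a minimal sketch of the kind of
check a receiver can make before trusting a message.  This is not the
OpenAFS code: the wire format (a request id and a length field ahead of
the payload) and the names encode, decode, and BadMessage are
assumptions made up for the example.

```python
import struct

# Hypothetical wire format, NOT the actual pioctl message layout:
# 4-byte request id, 4-byte payload length, then the payload bytes.
HEADER = struct.Struct("<II")


class BadMessage(Exception):
    """Raised when a message fails validation."""


def encode(request_id, payload):
    """Prefix the payload with its request id and length."""
    return HEADER.pack(request_id, len(payload)) + payload


def decode(expected_id, data):
    """Validate a message before trusting any of its contents."""
    if len(data) < HEADER.size:
        raise BadMessage("truncated header")
    request_id, length = HEADER.unpack_from(data)
    if request_id != expected_id:
        # Two concurrent requests got their messages crossed.
        raise BadMessage("message belongs to another request")
    if length != len(data) - HEADER.size:
        raise BadMessage("length field does not match payload")
    return data[HEADER.size:]
```

With a check like this, a crossed message turns into an error the
caller can handle (retry or report) rather than an unexpected
termination on either side.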

Whenever I have found evidence of a crash I have pursued it until the
issue could be addressed.  What I'm asking is whether something is
causing crashes that I have not been made aware of.
If so, let me know and it will be fixed.

Jeffrey Altman
