Root cause analysis is essential, even after the quick fix.
-sc

From: Andrew S. Baker [mailto:asbz...@gmail.com]
Sent: Thursday, September 23, 2010 9:22 AM
To: NT System Admin Issues
Subject: Re: Kick Ass Sysadmin (was RE: It appears that the Symantec Virus has affected PGP already)

Definitely, balance is key.

My experience is that the more time you are able to spend educating people while things are working, the more latitude you have to troubleshoot while things are down.

I'm pretty sure we've all had to do a quick-n-dirty fix. The problem comes when you have so many of those in place, that are not getting revisited for a more permanent solution, that you end up with a perfect storm scenario. Everyone loses there.

ASB (My XeeSM Profile) <http://XeeSM.com/AndrewBaker>
Exploiting Technology for Business Advantage...

On Thu, Sep 23, 2010 at 7:48 AM, James Rankin <kz2...@googlemail.com> wrote:

I agree, but sometimes you only have time to gather the facts after you've implemented a fix for the users screaming at you. I personally try to avoid server reboots as a fix; that comes from being judged on server uptime.

I'm not saying "don't gather facts", I'm saying that sometimes, in the support arena - and especially for small companies and/or teams - implementing or finding a fix ends up being more urgent than gathering facts, and sometimes you have to try different things to narrow the issue down (not "at random" though).

But I always believe in understanding why something happened. Sometimes I will spend inordinate amounts of time trying to understand a root cause, a lot more than I ever get given to implement a fix - but you can afford to do that when everyone isn't screaming at you.

I'm not saying that you shouldn't be systematic, just that sometimes you have to act quickly and decisively - and occasionally almost instinctively - using your previous experiences and knowledge as a guide. YMMV, etc. - I'm not trying to start a flame war here.
When dealing with users also, I agree, it always pays to check and double-check what issue they are actually experiencing. A lot gets lost in translation and through whichever logging system you use.

On 23 September 2010 12:15, Mike Hoffman <m...@drumbrae.net> wrote:

The most important of these is gathering the facts. This is not what the end user issue seems to be, but what it actually is. Then you can decide to either fix, mitigate, or investigate further.

I know of a number of IT companies where a server reboot is the fix to most issues, while I know that most issues are not affected by a reboot; it only delays identifying the cause.

Mike

From: James Rankin [mailto:kz2...@googlemail.com]
Sent: 23 September 2010 11:37
To: NT System Admin Issues
Subject: Re: Kick Ass Sysadmin (was RE: It appears that the Symantec Virus has affected PGP already)

I wasn't saying "random" based on "gut feeling". It was more an inkling that something was amiss with that particular function due to experience. Maybe I should have been more clear about what I meant by "didn't like the look of it".

When a system is down and you're the only one assigned to fix it, sometimes time is of the essence. In situations where you have time on your side, a more structured approach is ideal. Also, if you have an agreed SLA, you can be more considered in your approach. Unfortunately that isn't always present though.

However I wasn't saying I would just stop services for the hell of it on a live system that users were still able to access. That would just be plain irresponsible.

On 23 September 2010 11:29, Ken Schaefer <k...@adopenstatic.com> wrote:

Agreed. Making random changes to servers based on "gut feelings" about what is bad isn't my idea of a desirable troubleshooting strategy.

Gather facts
Isolate Issue
Identify Root Cause
Implement Fix

Cheers
Ken

From: Andrew S. Baker [mailto:asbz...@gmail.com]
Sent: Thursday, 23 September 2010 6:13 PM
To: NT System Admin Issues
Subject: Re: Kick Ass Sysadmin (was RE: It appears that the Symantec Virus has affected PGP already)

Another aspect of troubleshooting is the ability to keep track of what are actual facts, and what are as-yet-untested assumptions. This includes knowing how to classify information that has been given you by the end user.

ASB (My XeeSM Profile) <http://XeeSM.com/AndrewBaker>
Exploiting Technology for Business Advantage...

On Thu, Sep 23, 2010 at 2:42 AM, James Rankin <kz2...@googlemail.com> wrote:

It's not what you Google, it's how you Google it. Even when interviewing now I tend to try and look for people who can work problems out rather than people who can simply rhyme off lists of stuff - and I'm always keen on people who check the obvious things first. (Think "how would you troubleshoot a GPO that's failing to apply" rather than "name the FSMO roles".)

There's an art to troubleshooting technical issues that's sometimes hard to define. It's probably the old "clean minds and scruffy minds" thing. Scruffy minds move in unexpected directions and try things that wouldn't necessarily make sense. I can remember fixing some random server hang just by stopping a service I didn't like the look of. It's only afterwards that we realised that particular app was opening loads of ports and generally monopolising the system. I didn't really know what I was looking for, until I found it.

On 23 September 2010 00:31, Jonathan Link <jonathan.l...@gmail.com> wrote:

Sometimes I wonder if I'm just a good googler... Seems like 90% of my issues have been tackled (and documented!) by someone else.

On Wed, Sep 22, 2010 at 7:17 PM, David Lum <david....@nwea.org> wrote:

The place with the ad you mean?
I don't remember, but here's one in NY that is not completely different: http://www.linkedin.com/jobs?viewJob=&jobId=1007553

I do think I am generally kick-ass, just don't call me an expert at anything. My specialty is the near-vertical learning curve that is needed on an occasional basis. I get stuff like this almost every month:

Q. "Hey Dave, is this possible?" -or- "Hey this infrastructure piece is down and the guy who usually manages it is out and there's no documentation, can you make it work?"

In both cases:
A. "No clue..I mean in theory it is somehow possible" <run off> <back in 45 minutes> "yeah we can do it, here's a script/tool/some other clever capability".

The answer of course sometimes comes from this list, or Exchange list, or Michael B. Smith.

Ok I'm not kick ass at all, but I know how to contact a LOT of guys who are...

Dave "my expertise is knowing experts and how to contact them" Lum

________________________________
From: Steven M. Caesare [scaes...@caesare.com]
Sent: Wednesday, September 22, 2010 1:46 PM
To: NT System Admin Issues
Subject: RE: It appears that the Symantec Virus has affected PGP already

Hehe.. type of org?

-sc

From: David Lum [mailto:david....@nwea.org]
Sent: Wednesday, September 22, 2010 2:26 PM
To: NT System Admin Issues
Subject: RE: It appears that the Symantec Virus has affected PGP already

That reminds me, I was looking at job openings and one place had the job description on their website "looking for someone who is kick ass at finding technical solutions...". Being an informalish kind of guy, I was tempted to apply just based on that kind of verbiage. Still like %dayjob% enough to not apply though...

Dave

From: Steven M. Caesare [mailto:scaes...@caesare.com]
Sent: Wednesday, September 22, 2010 10:16 AM
To: NT System Admin Issues
Subject: RE: It appears that the Symantec Virus has affected PGP already

I'm using that on my next technical evaluation summary.

-sc

~ Finally, powerful endpoint security that ISN'T a resource hog!
~ ~ <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/> ~ --- To manage subscriptions click here: http://lyris.sunbelt-software.com/read/my_forums/ or send an email to listmana...@lyris.sunbeltsoftware.com with the body: unsubscribe ntsysadmin