Root cause analysis is essential, even after the quick fix.

 

-sc

 

From: Andrew S. Baker [mailto:asbz...@gmail.com] 
Sent: Thursday, September 23, 2010 9:22 AM
To: NT System Admin Issues
Subject: Re: Kick Ass Sysadmin (was RE: It appears that the Symantec
Virus has affected PGP already)

 

Definitely, balance is key.

 

My experience is that the more time you are able to spend educating
people while things are working, the more latitude you have to
troubleshoot while things are down.

 

I'm pretty sure we've all had to do a quick-n-dirty fix.  The problem
comes when you have so many of those in place, that are not getting
revisited for a more permanent solution, that you end up with a perfect
storm scenario.  Everyone loses there.


 

ASB (My XeeSM Profile) <http://XeeSM.com/AndrewBaker>  
Exploiting Technology for Business Advantage...
 





On Thu, Sep 23, 2010 at 7:48 AM, James Rankin <kz2...@googlemail.com>
wrote:

I agree, but sometimes you only have time to gather the facts after
you've implemented a fix for the users screaming at you. I personally
try to avoid server reboots as a fix; that comes from being judged on
server uptime. I'm not saying "don't gather facts", I'm saying that
sometimes, in the support arena - and especially for small companies
and/or teams - implementing or finding a fix ends up being more urgent
than gathering facts, and sometimes you have to try different things to
narrow the issue down (not "at random" though). But I always believe in
understanding why something happened. Sometimes I will spend inordinate
amounts of time trying to understand a root cause, a lot more than I
ever get given to implement a fix - but you can afford to do that when
everyone isn't screaming at you.

I'm not saying that you shouldn't be systematic, just that sometimes you
have to act quickly and decisively - and occasionally almost
instinctively - using your previous experiences and knowledge as a
guide. YMMV, etc. - I'm not trying to start a flame war here.

When dealing with users also, I agree, it always pays to check and
double-check what issue they are actually experiencing. A lot gets lost
in translation and through whichever logging system you use.

 

On 23 September 2010 12:15, Mike Hoffman <m...@drumbrae.net> wrote:

The most important of these is gathering the facts. This is not what
the end-user issue seems to be, but what it actually is. Then you can
decide to either fix, mitigate, or investigate further.

 

I know of a number of IT companies where a server reboot is the fix to
most issues. I know that most issues are not affected by a reboot;
it only delays identifying the cause.

 

Mike

 

From: James Rankin [mailto:kz2...@googlemail.com]
Sent: 23 September 2010 11:37
To: NT System Admin Issues
Subject: Re: Kick Ass Sysadmin (was RE: It appears that the Symantec
Virus has affected PGP already)

 

I wasn't saying "random" based on "gut feeling". It was more an inkling
that something was amiss with that particular function due to
experience. Maybe I should have been more clear about what I meant by
"didn't like the look of it". When a system is down and you're the only
one assigned to fix it, sometimes time is of the essence. In situations
where you have time on your side, a more structured approach is ideal.
Also, if you have an agreed SLA, you can be more considered in your
approach. Unfortunately that isn't always present though.

However I wasn't saying I would just stop services for the hell of it on
a live system that users were still able to access. That would just be
plain irresponsible.

On 23 September 2010 11:29, Ken Schaefer <k...@adopenstatic.com> wrote:

Agreed. Making random changes to servers based on "gut feelings" about
what is bad isn't my idea of a desirable troubleshooting strategy.

 

Gather facts 

Isolate Issue

Identify Root Cause

Implement Fix

 

Cheers

Ken

 

From: Andrew S. Baker [mailto:asbz...@gmail.com]
Sent: Thursday, 23 September 2010 6:13 PM
To: NT System Admin Issues
Subject: Re: Kick Ass Sysadmin (was RE: It appears that the Symantec
Virus has affected PGP already)

 

Another aspect of troubleshooting is the ability to keep track of what
are actual facts, and what are as-yet-untested assumptions.

 

This includes knowing how to classify information that has been given
to you by the end user.


ASB (My XeeSM Profile) <http://XeeSM.com/AndrewBaker>  
Exploiting Technology for Business Advantage...
 

On Thu, Sep 23, 2010 at 2:42 AM, James Rankin <kz2...@googlemail.com>
wrote:

It's not what you Google, it's how you Google it. Even when interviewing
now I tend to try and look for people who can work problems out rather
than people who can simply rhyme off lists of stuff - and I'm always
keen on people who check the obvious things first. (Think "how would you
troubleshoot a GPO that's failing to apply" rather than "name the FSMO
roles".) There's an art to troubleshooting technical issues that's
sometimes hard to define. It's probably the old "clean minds and scruffy
minds" thing. Scruffy minds move in unexpected directions and try things
that wouldn't necessarily make sense. I can remember fixing some random
server hang just by stopping a service I didn't like the look of. It's
only afterwards that we realised that particular app was opening loads
of ports and generally monopolising the system. I didn't really know
what I was looking for, until I found it.

On 23 September 2010 00:31, Jonathan Link <jonathan.l...@gmail.com>
wrote:

        Sometimes I wonder if I'm just a good googler...  Seems like 90%
of my issues have been tackled (and documented!) by someone else.

        
        
         

        On Wed, Sep 22, 2010 at 7:17 PM, David Lum <david....@nwea.org>
wrote:

        The place with the ad you mean? I don't remember, but here's one
in NY that is not completely different:

        http://www.linkedin.com/jobs?viewJob=&jobId=1007553

         

        I do think I am generally kick-ass, just don't call me an expert
at anything. My specialty is the near-vertical learning curve that is
needed on an occasional basis. I get stuff like this almost every
month:

        Q. "Hey Dave, is this possible?"

        -or-

        "Hey this infrastructure piece is down and the guy who usually
manages it is out and there's no documentation, can you make it work?"

         

        In both cases:

        A. "No clue..I mean in theory it is somehow possible" <run off>
<back in 45 minutes> "yeah we can do it, here's a script/tool/some other
clever capability".

         

        The answer of course sometimes comes from this list, or Exchange
list, or Michael B. Smith.

         

        Ok I'm not kick ass at all, but I know how to contact a LOT of
guys who are...

         

        Dave "my expertise is knowing experts and how to contact them"
Lum

        
________________________________


        From: Steven M. Caesare [scaes...@caesare.com]
        Sent: Wednesday, September 22, 2010 1:46 PM
        To: NT System Admin Issues
        Subject: RE: It appears that the Symantec Virus has affected PGP
already

        Hehe.. type of org?

         

        -sc

         

        From: David Lum [mailto:david....@nwea.org] 
        Sent: Wednesday, September 22, 2010 2:26 PM
        To: NT System Admin Issues
        Subject: RE: It appears that the Symantec Virus has affected PGP
already

         

        That reminds me, I was looking at job openings and one place
had the job description on their website "looking for someone who is
kick ass at finding technical solutions...". Being an informalish kind
of guy, I was tempted to apply just based on that kind of verbiage.

         

        Still like %dayjob% enough to not apply though...

         

        Dave

         

        From: Steven M. Caesare [mailto:scaes...@caesare.com] 
        Sent: Wednesday, September 22, 2010 10:16 AM
        To: NT System Admin Issues
        Subject: RE: It appears that the Symantec Virus has affected PGP
already

         

        I'm using that on my next technical evaluation summary.

         

        -sc

         

 

~ Finally, powerful endpoint security that ISN'T a resource hog! ~
~ <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/>  ~

---
To manage subscriptions click here:
http://lyris.sunbelt-software.com/read/my_forums/
or send an email to listmana...@lyris.sunbeltsoftware.com
with the body: unsubscribe ntsysadmin

