One quick and fairly easy way to partially do this is to set up a simple script
that runs a basic query (say, against the schema, which should be quick, but not,
say, a rootDSE query) and compares the response time against a baseline
acceptable time frame. I have done this in the past and found choked-up GCs
(specifically in relation to Exchange) using a little Perl and a little AdFind.

Rather than hardcoding GCs, set up a dedicated Exchange site. This protects your
main site from Exchange and Exchange from everything else. I.e., if Exchange
tears down a DC, Exchange suffers. If something else tears down a DC, Exchange
should be fairly well protected, as it shouldn't be a DC Exchange is using. Also,
and this is a point I have a strong opinion on: most GCs can go down and things
don't care; authentication will still work, etc. Exchange GCs generally can't do
that. This means you can keep certain GCs in mind for monitoring and for how you
respond when they go offline. At the widget factory I worked for, there were
only a few GCs I cared about in terms of how fast we got them back up and
running: the Exchange GCs and the PDCs. We cared about the other DCs/GCs, but we
weren't running in the middle of the night because of them.

Anyway, set up a script that takes a list of GCs or (better) a site or list of
sites and then goes into a loop. In the loop it gets the list of GCs or DCs and
runs a basic schema query against each one that returns some subset of objects
and attributes. Unless you are going against a GC across some slow wires, any
query should be back in a second or less for an idle DC. As load builds you will
see 1, 2, 3, 6, 8 second responses. Once you hit 20+ seconds on a query, you
really need to be looking at things. At 30 seconds you most certainly have
Exchange queue backups and probably store hangs.
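
As a rough illustration of that loop, here is a minimal Python sketch (not the Perl/AdFind script I actually used). It only covers the timing and classification part. The names `classify`, `time_query`, and `check_gcs` are made up for this example, and `run_query` is a placeholder you would have to supply yourself, for instance a function that shells out to AdFind or uses an LDAP library to run the schema query against the named GC. The 20 and 30 second thresholds are the rule-of-thumb numbers from above, not hard limits.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

# Rule-of-thumb thresholds (seconds) from the text above; tune for your site.
INVESTIGATE_SECS = 20.0
CRITICAL_SECS = 30.0


def classify(elapsed: float) -> str:
    """Map a query response time to a coarse health label."""
    if elapsed >= CRITICAL_SECS:
        return "critical"
    if elapsed >= INVESTIGATE_SECS:
        return "investigate"
    return "ok"


def time_query(run_query: Callable[[str], object], gc: str) -> float:
    """Run the caller-supplied query against one GC and return elapsed seconds."""
    start = time.monotonic()
    run_query(gc)
    return time.monotonic() - start


def check_gcs(
    gcs: Iterable[str],
    run_query: Callable[[str], object],
) -> Dict[str, Tuple[Optional[float], str]]:
    """Time the query against each GC; a query that raises is marked 'failed'."""
    results: Dict[str, Tuple[Optional[float], str]] = {}
    for gc in gcs:
        try:
            elapsed = time_query(run_query, gc)
            results[gc] = (elapsed, classify(elapsed))
        except Exception:
            # Hypothetical policy: any error (timeout, bind failure) counts as failed.
            results[gc] = (None, "failed")
    return results
```

You would call `check_gcs(list_of_gc_names, your_query_function)` on a schedule and log the elapsed times so that you build up a baseline for what normal looks like at idle and under load.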

If you are monitoring this and you normally see 3-4 seconds at main load, and
you then hit 10 seconds consistently on a GC, page on that and start chasing.
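
One way to pin down "consistently" is to page only when several samples in a row are over the threshold. The sketch below assumes a window of three consecutive samples, which is my own pick for illustration and not something I measured; the class name `SustainedSlowDetector` is likewise hypothetical. You would keep one detector per GC.

```python
from collections import deque


class SustainedSlowDetector:
    """Signal a page only when the last `window` samples are all slow."""

    def __init__(self, threshold_secs: float = 10.0, window: int = 3):
        # window=3 is an assumed value; pick whatever suits your polling interval.
        self.threshold_secs = threshold_secs
        self.samples = deque(maxlen=window)

    def add(self, elapsed: float) -> bool:
        """Record one response time; return True if it is time to page."""
        self.samples.append(elapsed)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(s >= self.threshold_secs for s in self.samples)
```

A single slow query then does not wake anyone up, but a GC that stays at or above 10 seconds for three polls in a row does.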
joe

In our environment we have lots of GCs, most of which I don't control. While I
run a dcdiag report each morning that checks the overall health of my domain,
including whether a DC is advertising itself as a GC, we see issues once in a
while when a process does a GC discovery action and ends up on a "bad" one,
e.g., not available, busy, slow network, maybe permissions, etc.
The other day our Exchange cluster was running like a dog since, after a
reboot, it hooked itself up with a GC that was not performing particularly well.
As a solution for that particular problem, we were able to hardcode into the
Exchange servers specific GCs that I know work well. Has anyone developed a
script that checks on the health of GC functionality or dealt with this issue
some other way?
Thanks in advance!
Mike Thommes