Glad Brett picked up on analysing the different errors you were getting -
I've not seen these before.
I'm curious to hear what type of issue you are testing to recover from. From
what you write, I gather you are testing a restore of your production domain
to another (hopefully physically separated) test system, i.e. you are testing
a full recovery of your AD domain or forest - is this correct?
If so, an authoritative restore of the AD DB is not the right approach anyway.
The restore database option gives the false impression of doing a full
recovery of AD - it carries more risk than value, and this is likely why it
was removed from Longhorn. In a distributed multi-master database such as AD,
authoritatively restoring the partition on one DC will never completely
overwrite the same partition on the other DCs: although you might be lucky and
think you have fully recovered, any objects created, or new attributes added
to existing objects, in the respective AD partition after you performed the
backup will replicate back to the restored DC.
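
To illustrate the point, here is a toy sketch in Python. It is not how AD replication is actually implemented (real replication uses USNs, up-to-dateness vectors and per-attribute metadata, and also handles deletions); it only assumes a simplified rule where each attribute carries a version number, the higher version wins, and anything missing on a DC is simply added. The names `replicate`, `sync`, `authoritative_restore`, `simulate` and the `VERSION_BUMP` value are made up for this example.

```python
import copy

# Illustrative assumption: the auth. restore raises the version of every
# restored attribute by a large amount so that it wins on all other DCs.
VERSION_BUMP = 100_000


def replicate(source, target):
    """Copy every attribute that the target lacks or holds at a lower version."""
    for dn, attrs in source.items():
        target_attrs = target.setdefault(dn, {})
        for name, (value, version) in attrs.items():
            if name not in target_attrs or target_attrs[name][1] < version:
                target_attrs[name] = (value, version)


def sync(dc_a, dc_b):
    """Replicate in both directions, as two writable DCs would."""
    replicate(dc_a, dc_b)
    replicate(dc_b, dc_a)


def authoritative_restore(backup):
    """Return the backup contents with every attribute version bumped."""
    return {
        dn: {name: (value, version + VERSION_BUMP)
             for name, (value, version) in attrs.items()}
        for dn, attrs in backup.items()
    }


def simulate():
    # Two DCs start out identical; the backup is taken from DC1.
    dc1 = {"CN=alice": {"title": ("Engineer", 1)}}
    dc2 = copy.deepcopy(dc1)
    backup = copy.deepcopy(dc1)

    # Changes made on DC2 after the backup was taken.
    dc2["CN=alice"]["title"] = ("Manager", 2)      # changed attribute
    dc2["CN=alice"]["phone"] = ("555-0100", 1)     # new attribute
    dc2["CN=mallory"] = {"title": ("Intern", 1)}   # new object
    sync(dc1, dc2)

    # DC1 is authoritatively restored from the backup, then replicates again.
    dc1 = authoritative_restore(backup)
    sync(dc1, dc2)
    return dc1, dc2


if __name__ == "__main__":
    restored_dc, other_dc = simulate()
    for dn, attrs in sorted(restored_dc.items()):
        print(dn, attrs)
```

In this model the restored value of `title` wins everywhere, but the `phone` attribute and the `CN=mallory` object, which did not exist in the backup, flow straight back to the restored DC. That is the sense in which the result only looks like a full recovery.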
The correct way to fully restore AD is to restore only a single instance of
the DB (i.e. a single DC) and re-build / re-promote all the other DCs. Instead
of performing an auth. restore of the DB, you'd just restore it
non-authoritatively and do a metadata cleanup of all the other DCs on the
restored DC to ensure it is the only one representing your domain (you would
mark SysVol as primary during the restore process). There are a few more steps
to perform to ensure that the recovered DC doesn't replicate any data from
other existing DCs in your environment - all of these are described in the
(fairly old) AD Forest Recovery Whitepaper, which largely also applies to the
full recovery of a single domain:
http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=3EDA5A79-C99B-4DF9-823C-933FEBA08CFE
It's a little more complex in a multi-domain environment, as you also have to
take care of the partitions of your domain on GCs in other domains - if your
goal is to also fully restore the config partition, you're talking about a
full forest restore anyway (which would use roughly the same approach:
restoring a single DC of every domain, then re-promoting all other DCs).
Although the LH backup and recovery procedures are not fully finalized yet,
the process for a full AD recovery would still be roughly the same as
described above (mind you, there are big changes to the built-in backup tool,
and recovery of a DC to different HW should now also be a valid option). The
main change with LH that will strongly influence the time and risk of a
domain/forest level restore is the fact that you will not have as many
writable DCs in your environment. Even if you are strongly distributed
geographically, the goal will be to host writable DCs only in your
datacenters and make all the other DCs in your environment read-only. As the
name implies, Read-Only DCs (RODCs) do not allow any originating writes and
will never replicate anything back to a writable DC - this way there is less
work involved in ensuring a consistent state of AD during the recovery. I'm
not saying you won't also have to re-promote the RODCs, but you certainly have
fewer writable DCs to worry about and can possibly leave the RODCs running
during the recovery process (we'll have to see about this). I have good hopes
that this will increase the overall recovery speed of AD in large distributed
deployments.
/Guido
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joshua Coffman
Sent: Tuesday, 20 June 2006 21:33
To: ActiveDir@mail.activedir.org
Subject: RE: Re: [ActiveDir] Errors During Authoritative Restore
I appreciate your assistance on this.
Yes, there are tons of schema mods.
In the domain throwing the majority of the errors, these mods were performed using an LDIF file, during the installation of a 3rd party Identity Management Application.
I do not know if there have been LDAP naming attributes added or not. If you can send a query to verify, I would be happy to run it.
I knew that Restore Database is the "last resort" method, but that is what we wanted to test. We do have multiple DCs replicating across multiple geographic sites, so this scenario is unlikely, unless there were some sort of catastrophic corruption that took place.
In the future, if "restore database" is unavailable, what will be used in its place if you need to do a bare metal authoritative restore of the entire AD?
It will take a while to run the tools you requested against the AD, because it is a production system. I cannot run them directly in the PROD environment, so I would have to pull a mirrored drive from the prod DC, and pop it into an offline server. This could take a while for the required approvals.
Thanks again for your help!
Josh
> Date: Tue, 20 Jun 2006 10:09:58 -0700
> From: [EMAIL PROTECTED]
> To: ActiveDir@mail.activedir.org
> Subject: Re: [ActiveDir] Errors During Authoritative Restore
>
> Do you have any schema extensions applied? Do you know if those schemas
> added any LDAP naming attributes? If the 2nd question doesn't make sense
> to you, I'll figure out a way you can query this, and send it to us.
>
> Aside, it is generally not recommended to run "restore database". In fact
> this command was removed from Longhorn.
>
> If you decide to retry that scenario again, I can suggest some
> intermediate steps that would be good to know. i.e.
>
> 1. Before running auth restore, it would be interesting to know the results
> of an esentutl /k ntds.dit (checksum the database).
>
> 2. After auth restore, it would be good to know if the database is
> logically consistent from ESE's perspective (do this via "esentutl /g
> ntds.dit").
>
> 3. Also after auth restore, it would be good to know if it is logically
> consistent from AD's perspective (do this via the exact command line
> provided):
> ntdsutil "sem data anal" "go" "q" "q"
>
> Cheers,
> BrettSh [msft]
> Ex-Building 7 Garage Door Operator
>
>
> On Tue, 20 Jun 2006, Joshua Coffman wrote:
>
> > I have a few questions for you AD gurus out there! :)
> >
> > I just ran through a Disaster Recovery test of two of our ADs and I
> > have a few questions which have come up as a result of the test.
> >
> > Configuration Notes:
> > These boxes are Windows 2003, SP1.
> > The domains were originally Windows 2000 domains.
> >
> > The following errors pop up on one of the domain controllers during
> > the restore.
> >
> > "Could not display the attribute type for the object with DNT
> > 831424.Error: failed to get dn of dnt 831424" This occurs many times
> > throughout the restore.
> >
> > NOTE: This is during a complete restore, e.g. "authoritative restore:
> > restore database" I also see a few of these.
> >
> > "There was an error parsing the GUID from the file on line: 1981" (Not
> > too many of these, maybe four or five)
> >
> > Additionally, with SP1, LDIF files are created to restore back-links.
> > The file that restores the user/group back-links imports successfully.
> > The file that restores the configuration back-links fails. (sorry, I
> > do not have the error handy)
> >
> > The authoritative restore says it completed successfully, and after I
> > go through metadata cleanup and FSMO seizure, the box starts up
> > without any errors, and AD throws no errors on startup.
> >
> > I was wondering if anyone can tell me what these errors mean? What
> > are their ramifications? How can the errors be resolved?
> >
> > Thanks,
> >
> > Josh
>
> List info : http://www.activedir.org/List.aspx
> List FAQ : http://www.activedir.org/ListFAQ.aspx
> List archive: http://www.activedir.org/ml/threads.aspx