RE: [ActiveDir] Database Corruption

2005-08-26 Thread Alex Fontana
Thanks Brett!

I will definitely make a copy of the ntds directory before any changes.  I
also plan to do a full hardware check before defragging/restoring.  Thanks.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Brett Shirley
Sent: Friday, August 26, 2005 6:15 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Database Corruption

Alex,

Unfortunately, only the developer version of eseutil.exe gives out more
info, including a raw hex dump of the page.  I'm a little curious to see
if the tail of 81183 and the head of 81184 look skewed; sometimes we've
seen a disk corruption where the bytes seem right, just off by several
bytes ... but maybe a probable explanation will present itself from just
the output of the header ...

If you make a copy of the bad database (& logs), before you defrag or
restore, it gives you / us the chance to ask more questions about the
nature of the corruption later ...

Cheers,
BrettSh [msft]

> This posting is provided "AS IS" with no warranties, and confers no
> rights.



RE: [ActiveDir] Database Corruption

2005-08-26 Thread Brett Shirley
Alex,

Unfortunately, only the developer version of eseutil.exe gives out more
info, including a raw hex dump of the page.  I'm a little curious to see
if the tail of 81183 and the head of 81184 look skewed; sometimes we've
seen a disk corruption where the bytes seem right, just off by several
bytes ... but maybe a probable explanation will present itself from just
the output of the header ...

If you make a copy of the bad database (& logs), before you defrag or
restore, it gives you / us the chance to ask more questions about the
nature of the corruption later ...
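
For anyone without the developer build, the tail of page 81183 and the head
of page 81184 can be eyeballed from a copy of the file with a few lines of
Python.  This is only a rough sketch, not a substitute for esentutl: it
assumes 8192-byte pages and that page N starts at byte (N + 1) * 8192, which
is what the offset and page number in the event imply.  The function names
and the path in the usage note are made up for illustration, and the sketch
does not interpret the page header at all.

```python
PAGE_SIZE = 8192  # page size reported in the event for this ntds.dit


def read_page(db_copy_path, pgno, page_size=PAGE_SIZE):
    """Read the raw bytes of one page from a COPY of the database file."""
    # Assumed layout: page N starts at (N + 1) * page_size. This matches the
    # offset/page pair in the event but is an inference, not documented here.
    offset = (pgno + 1) * page_size
    with open(db_copy_path, "rb") as f:
        f.seek(offset)
        data = f.read(page_size)
    if len(data) != page_size:
        raise ValueError(f"short read for page {pgno}: got {len(data)} bytes")
    return data


def hex_lines(data, base_offset, width=16):
    """Format bytes as 'file offset: hex bytes' lines, `width` bytes each."""
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        lines.append(f"{base_offset + i:#010x}: {chunk.hex(' ')}")
    return lines


def dump_boundary(db_copy_path, pgno, nbytes=64, page_size=PAGE_SIZE):
    """Hex-dump the tail of page pgno-1 followed by the head of page pgno."""
    prev_page = read_page(db_copy_path, pgno - 1, page_size)
    this_page = read_page(db_copy_path, pgno, page_size)
    boundary = (pgno + 1) * page_size  # file offset where page pgno begins
    return (hex_lines(prev_page[-nbytes:], boundary - nbytes)
            + hex_lines(this_page[:nbytes], boundary))
```

Usage would be something like
`for line in dump_boundary(r"D:\ntds-copy\ntds.dit", 81184): print(line)`,
run against the copy, never the live database.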

Cheers,
BrettSh [msft]

> This posting is provided "AS IS" with no warranties, and confers no
> rights.



RE: [ActiveDir] Database Corruption

2005-08-23 Thread Al Mulnick
Hopefully it's just an index that's taken one for the team.  
 
Take the advice and ensure that the hardware is solid before declaring things
well enough to be restored etc. This was the type of error in the Exchange
world that would bug you till the end.  It was associated with everything from
disk controller settings (battery backup) to faulty disks, to transient
hardware errors.  Tough to diagnose, but a hardware error was almost always
the root cause (like >99% of the time). Software issues (misconfigured AV etc)
were sometimes to blame and would take things out, but see above for the
frequency of that.
 
The fact that the page number stays the same is a good thing.  The fact that
the error occurred at all is not. Disk or other hardware would be my next
suspect, all the way down to the motherboard (checked the revs to ensure no
issues yet?)
 
I also have to admit that a restore is not my favorite method.  If the
bandwidth can support it, I'd prefer to dcpromo the repaired piece of
hardware, especially for a smaller DIT. That's just my preference though.
 
Good luck,
 
Al



From: [EMAIL PROTECTED] on behalf of Alex Fontana
Sent: Mon 8/22/2005 9:30 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Database Corruption



ECC memory, no errors in the event logs relating to memory.  The ntds.dit is
about 800MB.  There are multiple events, the page number is always the same
(81184).

Haven't fixed it yet - it's limping along until this weekend when I'll dump
the pages to see what the header shows - then either defrag or restore...


RE: [ActiveDir] Database Corruption

2005-08-22 Thread Alex Fontana
ECC memory, no errors in the event logs relating to memory.  The ntds.dit is
about 800MB.  There are multiple events, the page number is always the same
(81184).

Haven't fixed it yet - it's limping along until this weekend when I'll dump
the pages to see what the header shows - then either defrag or restore...

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Brett Shirley
Sent: Monday, August 22, 2005 10:22 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Database Corruption

Steve's, Hunter's, and your original advice are all sound ... I think it is
very likely if you call PSS, they'll tell you to do Steve's, yours, and
Hunter's advice in about that order.

My favorite disk sub-system diagnostic is jetstress, but dedicated disk
sub-system stressers are better, as they try odd patterns of bits that
they know buses, electrical systems, and disks get fouled up on.  Also do
not ignore RAM checkers; bad RAM is almost as likely, perhaps even more
likely here.

Do you have ECC or parity memory?  Any events in system or app event log
related to parity memory issues?

BTW, how big is your ntds.dit file?  Is it over 1.5-2.5 GB?  That
would strengthen the hypothesis of memory issues.

So you have multiple of these events?  If you do, do they always happen
for the same page numbers ("pgno") and offsets?  If different, does their
frequency increase?

If you haven't restored it already, I'd be curious, if you felt like
sharing, what the page looked like from:
   esentutl /m ntds.dit /p81184 /v
 ... then we could see how badly the header was corrupted.  Also this will
tell you if the page is an "Index page", and thus likely to be fixed by an
offline defrag.  If you see "primary" or "long value" page, offline defrag
probably won't fix it.

Also get the previous page too (change 81184 to 81183 in the above
command).  But again, only if you feel like sharing.

Cheers,
BrettSh

This posting is provided "AS IS" with no warranties, and confers no
rights.




RE: [ActiveDir] Database Corruption

2005-08-22 Thread Brett Shirley
Steve's, Hunter's, and your original advice are all sound ... I think it is
very likely if you call PSS, they'll tell you to do Steve's, yours, and
Hunter's advice in about that order.

My favorite disk sub-system diagnostic is jetstress, but dedicated disk
sub-system stressers are better, as they try odd patterns of bits that
they know buses, electrical systems, and disks get fouled up on.  Also do
not ignore RAM checkers; bad RAM is almost as likely, perhaps even more
likely here.

Do you have ECC or parity memory?  Any events in system or app event log
related to parity memory issues?

BTW, how big is your ntds.dit file?  Is it over 1.5-2.5 GB?  That
would strengthen the hypothesis of memory issues.

So you have multiple of these events?  If you do, do they always happen
for the same page numbers ("pgno") and offsets?  If different, does their
frequency increase?

If you haven't restored it already, I'd be curious, if you felt like
sharing, what the page looked like from:
   esentutl /m ntds.dit /p81184 /v
 ... then we could see how badly the header was corrupted.  Also this will
tell you if the page is an "Index page", and thus likely to be fixed by an
offline defrag.  If you see "primary" or "long value" page, offline defrag
probably won't fix it.

Also get the previous page too (change 81184 to 81183 in the above
command).  But again, only if you feel like sharing.
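
Whether the events keep naming the same page can also be checked
mechanically.  A minimal sketch, assuming the event 475 descriptions have
already been exported to text by some other means; the regular expression is
written against the wording of the description quoted in this thread, and the
function names are hypothetical.

```python
import re
from collections import Counter

# Pattern follows the wording of the event 475 description quoted in this
# thread; other events or OS versions may phrase it differently.
PAGE_MISMATCH = re.compile(
    r"at offset (?P<offset>\d+) \(0x[0-9a-fA-F]+\).*?"
    r"expected page number was (?P<expected>\d+) \(0x[0-9a-fA-F]+\).*?"
    r"actual page number was (?P<actual>\d+)"
)


def parse_event(description):
    """Return (offset, expected_pgno, actual_pgno), or None if no match."""
    # Collapse line breaks and runs of spaces introduced by mail wrapping.
    text = " ".join(description.split())
    m = PAGE_MISMATCH.search(text)
    if m is None:
        return None
    return int(m["offset"]), int(m["expected"]), int(m["actual"])


def tally_by_page(descriptions):
    """Count how many events name each expected page number."""
    counts = Counter()
    for d in descriptions:
        parsed = parse_event(d)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts
```

A single key in the resulting counter would mean every event points at the
same page; several keys, or counts that grow over time, would point more
toward a spreading hardware problem.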

Cheers,
BrettSh

This posting is provided "AS IS" with no warranties, and confers no
rights.




List info   : http://www.activedir.org/List.aspx
List FAQ: http://www.activedir.org/ListFAQ.aspx
List archive: http://www.mail-archive.com/activedir%40mail.activedir.org/


RE: [ActiveDir] Database Corruption

2005-08-20 Thread Coleman, Hunter
I'd also look at running hardware diagnostics, particularly on the disk 
subsystem and controller. No point in restoring or repromoting if there is an 
unresolved hardware problem.

-Original Message- 
From: [EMAIL PROTECTED] on behalf of Steve Linehan 
Sent: Fri 8/19/2005 8:18 PM 
To: ActiveDir@mail.activedir.org 
Cc: 
Subject: RE: [ActiveDir] Database Corruption



Well the first thing I always recommend is to try an offline defrag as 
it is possible that the corruption is in an index, i.e. metadata, that can be 
rebuilt.  If the offline defrag fails then restoring from backup or repromoting 
will be your next step.

 

Thanks,
-Steve

 



RE: [ActiveDir] Database Corruption

2005-08-19 Thread Steve Linehan








Well the first thing I always recommend is to try an offline defrag as
it is possible that the corruption is in an index, i.e. metadata, that can be
rebuilt.  If the offline defrag fails then restoring from backup or repromoting
will be your next step.

Thanks,
-Steve

 









From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ayers, Diane
Sent: Friday, August 19, 2005 6:43 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Database Corruption

My preferred approach would be to demote the box to member server and
re-promote to a domain controller to ensure a good fresh copy of the DIT.  YMMV
as the specific requirements at your location may prevent this.  We have only
run into this once early in our AD days and this was the approach we used with
good success.

Diane

 








RE: [ActiveDir] Database Corruption

2005-08-19 Thread Ayers, Diane



My preferred approach would be to 
demote the box to member server and re-promote to a domain controller to ensure 
a good fresh copy of the DIT.  YMMV as the specific requirements at your 
location may prevent this.  We have only run into this once early in our AD 
days and this was the approach we used with good success.
 
Diane


From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Alex Fontana
Sent: Friday, August 19, 2005 3:29 PM
To: ActiveDir@mail.activedir.org
Subject: [ActiveDir] Database Corruption

Started getting the error below a few weeks ago on one of our DCs.  My
first reaction is to run a non-auth restore from a day before this started
happening and let replication take care of everything else.  Any reason NOT to
do this?  I’m concerned that this may happen again and wasn’t able to find
anything specific to the error below.  Besides calling PSS, anything else I
should look into before restoring?  This box holds all FSMO roles, Win2k3,
server for NIS.

TIA
-alex

Event Type:     Error
Event Source:   NTDS ISAM
Event Category: Database Page Cache
Event ID:       475
Date:           8/19/2005
Time:           2:00:24 PM
User:           N/A
Computer:       DC
Description:
NTDS (528) NTDSA: The database page read from the file
"C:\WINNT\NTDS\ntds.dit" at offset 665067520 (0x27a42000) for 8192
(0x2000) bytes failed verification due to a page number mismatch.  The
expected page number was 81184 (0x00013d20) and the actual page number was
2349964126 (0x8c119b5e).  The read operation will fail with error -1018
(0xfc06).  If this condition persists then please restore the database from
a previous backup. This problem is likely due to faulty hardware. Please
contact your hardware vendor for further assistance diagnosing the problem.
 


[ActiveDir] Database Corruption

2005-08-19 Thread Alex Fontana








Started getting the error below a few weeks ago on one of
our DCs.  My first reaction is to run a non-auth restore from a day before
this started happening and let replication take care of everything else. 
Any reason NOT to do this?  I’m concerned that this may happen again
and wasn’t able to find anything specific to the error below. 
Besides calling PSS, anything else I should look into before restoring?  This
box holds all FSMO roles, Win2k3, server for NIS.

 

TIA
-alex

Event Type:     Error
Event Source:   NTDS ISAM
Event Category: Database Page Cache
Event ID:       475
Date:           8/19/2005
Time:           2:00:24 PM
User:           N/A
Computer:       DC
Description:

NTDS (528) NTDSA: The database page read from the file
"C:\WINNT\NTDS\ntds.dit" at offset 665067520 (0x27a42000) for
8192 (0x2000) bytes failed verification due to a page number
mismatch.  The expected page number was 81184 (0x00013d20) and the actual
page number was 2349964126 (0x8c119b5e).  The read operation will fail
with error -1018 (0xfc06).  If this condition persists then please
restore the database from a previous backup. This problem is likely due to
faulty hardware. Please contact your hardware vendor for further assistance
diagnosing the problem.
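
The numbers in the event description are internally consistent, which can be
checked with a little arithmetic.  A minimal sketch, assuming the byte offset
of a page is (page number + 1) * page size, i.e. that an area the size of two
pages precedes page 1.  That relation is inferred from the figures in the
event above rather than taken from ESE documentation, and the function names
are hypothetical.

```python
PAGE_SIZE = 8192  # bytes per page, from "for 8192 (0x2000) bytes" in the event


def page_to_offset(pgno, page_size=PAGE_SIZE):
    """Byte offset at which page `pgno` starts (assumed relation, see above)."""
    return (pgno + 1) * page_size


def offset_to_page(offset, page_size=PAGE_SIZE):
    """Inverse of page_to_offset; rejects offsets that are not page-aligned."""
    if offset % page_size != 0:
        raise ValueError("offset is not aligned to a page boundary")
    return offset // page_size - 1


def as_unsigned16(code):
    """Two's-complement view of a negative error code in 16 bits."""
    return code & 0xFFFF


if __name__ == "__main__":
    print(offset_to_page(665067520))    # 81184, the expected page number
    print(hex(page_to_offset(81184)))   # 0x27a42000, the offset in the event
    print(hex(as_unsigned16(-1018)))    # 0xfc06, the error code as logged
    print(hex(2349964126))              # 0x8c119b5e, the "actual" page number
```

The expected page number, offset, and error code all agree with each other;
only the "actual" page number read from disk (0x8c119b5e) is out of line,
and it is far beyond anything an 800MB file with 8192-byte pages could hold.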