There are three types of replication when you are talking
about passwords.
1. Urgent replication. When a password changes anywhere, the DC
sends out an urgent replication notification (i.e. no holdback
on the notification, it goes now - use my adqueueloop to watch for it). Again,
the last time I looked, the priority is the same as any normal partition change
(i.e. higher than a GC change but no higher than, say, a change of description
in the default partition). So basically this goes into the queue and zips about
the site (or anywhere change notification is enabled) fairly
quickly, then stops dead when it hits the site walls and waits for the site link
configurations (again, change notification configured for site links can modify
this). I actually mention that below. This type of replication uses all of the
normal replication mechanisms, so if your inbound thread is tied up on a DC with
lots of default partition changes you could see all kinds of delay getting
the change around (this is why you monitor DRA Pending - it needs to go to
zero every replication period).
2. Immediate Replication. This is when a PDC is contacted
via a specific RPC call to update the password when it is changed on another DC.
This does not use the normal replication engine so isn't impacted by normal
replication delays. This functionality has been in there since OEM. It is best
effort, it will try to get the change back to the PDC but if something prevents
it (busy PDC, network issues, dead PDC, etc) then the change just goes through
the normal #1 replication. Also, once again, the AvoidPDConWAN setting impacts
this: it completely disables it unless the PDC is in the same site as the DC
where the password change occurred. If you are NOT seeing this, I highly recommend
auditing the DCs to make sure that the reg setting isn't set and that your PDC
is working correctly and that network is ok.
3. Single User Object On Demand Replication or simply On
Demand Replication. This is when a PDC chaining event occurs, the PDC
immediately pushes the user info down to the DC that did the chaining. This is
what 812499 adds to the mix (Also K3 RC1). This is not using the standard
replication engine, this is completely out of band. This functionality does not
exist pre-812499, it is a bug fix (well, they didn't consider it a bug but everyone
who has had to deal with it has). It was a hole in the design and wasn't the
intended customer experience. Replication delays will not impact this, however
the AvoidPDConWAN setting would, because you don't chain when that is set. See the
section of the previously mentioned doc called Single User Object On Demand
Replication.
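To make the interaction between the three mechanisms and AvoidPDConWAN easier
to eyeball, here is a minimal sketch in Python. It is only a toy model of the
description above, not anything Windows exposes: the Scenario fields and the
password_paths function are hypothetical names, and the handling of chaining
under AvoidPDConWAN is an assumption taken from the wording of item 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical inputs; none of these names come from Windows itself."""
    avoid_pdc_on_wan: bool       # AvoidPDConWAN registry setting on the DC
    pdc_in_same_site: bool       # PDC shares a site with the DC that took the change
    pdc_reachable: bool = True   # PDC is up, not swamped, network is fine
    has_812499: bool = True      # DC carries the 812499 fix (or is K3)


def password_paths(s: Scenario) -> dict[str, bool]:
    """Return which of the three mechanisms can carry a change in this scenario."""
    # 1. Urgent replication always applies; it rides the normal replication
    #    engine, so queue delays and site link schedules still affect it.
    urgent = True

    # 2. Immediate replication: best-effort RPC to the PDC. AvoidPDConWAN
    #    disables it unless the PDC is in the same site.
    immediate = s.pdc_reachable and (not s.avoid_pdc_on_wan or s.pdc_in_same_site)

    # 3. On-demand replication: needs the fix and a PDC chaining event.
    #    Assumption: chaining is treated as off whenever AvoidPDConWAN is set,
    #    which is all the description above states.
    on_demand = s.has_812499 and s.pdc_reachable and not s.avoid_pdc_on_wan

    return {"urgent": urgent, "immediate": immediate, "on_demand": on_demand}


# A remote DC with AvoidPDConWAN set is left with urgent replication only.
print(password_paths(Scenario(avoid_pdc_on_wan=True, pdc_in_same_site=False)))
```

With the setting on and the PDC in another site, only the normal #1 path
remains, which is why the password then has to wait for the site links.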
Locked out accounts again, this is a different story. I
recall reading the diffs previously but don't have them on the tip of my tongue.
PDC chaining does not occur in the same way for a locked out account. There is a
difference. Account lockouts really shouldn't be happening a lot to normal users
and not at all to admins unless they
1. Have an old crappy client
2. Have some bad software that does stupid things (old versions of Outlook
with an expired account, for instance, can generate hundreds of auths a
second)
3. Are being attacked.
4. Are a bonehead
or you
1. Have the lockout policy set to some insane setting (like
3 bads and locked forever or for an hour or whatever; you want 3 bads and a
lockout, fine, unlock in 5 min then).
All of those are correctable. I think you were around when
I got in a fight with HP's first level folks because I took away their ability
to unlock each other's accounts. They said they needed it because they kept
getting locked out. My response was they needed to get a little smarter and be
careful and actually know what they are doing versus just being a clicking bump
on the log (man did I get in trouble for that one...). That was a very unhappy
fight for them and they lost but the number of lockouts on that team went down
dramatically.
If you have a lockout policy that tends towards locking out
valid users, it needs to be reviewed. The concept of the lockout policy is to
prevent cracking of passwords due to enough password attempts making it through.
Careful control of the lockout policy tied with the password policy is how this
is done correctly. You can turn up how many bads it takes to get a lockout and
turn down how fast it unlocks automatically if you have a decent password policy.
The longer the required password, the higher the lockout bad count can be. We have
a policy standard of 5 bads which I think is ridiculously low. Due to bugs in
Win9x it is currently set at 15 bads which is more realistic. Unlock time is 15
minutes as well. This means there could be ~60 attempts an hour which shouldn't
be enough to compromise any decent password in any real time. Note if you
have avoidpdconwan set, you have a possible security issue here which you should
be thinking about and testing - especially if you have a lot of DCs that are all
on the WAN and reachable from a single location.
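The ~60 attempts an hour figure is just the bad count times the number of
unlock cycles in an hour. A minimal sketch of that arithmetic follows; the
function name is made up, it ignores the time spent typing guesses and any
bad-count reset window, and the independent_dcs parameter is my assumption
about what the avoidpdconwan warning implies (each DC counting bads on its own).

```python
def guesses_per_hour(threshold: int, unlock_minutes: float,
                     independent_dcs: int = 1) -> float:
    """Rough upper bound on password guesses an attacker gets per hour.

    threshold       -- bad attempts allowed before lockout
    unlock_minutes  -- automatic unlock time
    independent_dcs -- ASSUMPTION: number of DCs that each track bad counts
                       separately (the avoidpdconwan scenario); 1 otherwise
    """
    lockout_cycles_per_hour = 60 / unlock_minutes
    return threshold * lockout_cycles_per_hour * independent_dcs


print(guesses_per_hour(15, 15))      # 15 bads, unlock in 15 minutes
print(guesses_per_hour(20, 15))      # 20 bads, unlock in 15 minutes
print(guesses_per_hour(15, 15, 30))  # same policy, 30 DCs counting separately
```

The last call is the case to test for: the per-DC number looks harmless, the
multiplied one may not be.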
If I had my druthers for a normal corporate
environment, I would see password policies of like 20 bad, unlock in 15.
Passwords of 15 characters or better. Simple complexity, BOTH upper and
lower case. Obviously the MS Complexity filter doesn't fit that, but you can get
that out of products like PSYNCH or just write your own filter. Passwords are
changed at least every 84-91 days (multiple of 7) - NO NON-EXPIRING IDS EVER.
Admin passwords changed every 30 days and this shouldn't have to be enforced by
the system, you should be able to tell your admins, hey make sure your passwords
don't go over 30 days - they shouldn't be logging on interactively to their
workstations so they shouldn't be getting notifications for them most of the
time anyway when they are approaching expiration. Admin passwords should NOT be
in sync with normal passwords and probably should be longer than 15 characters,
people start thinking pass phrases... Obviously the longer passwords won't work
in an environment that you insist on keeping mainframes and other systems that
can only support short passwords and you insist on syncing your IDs instead
of using say kerberos authentication across platforms...
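For what it is worth, the rules above are simple enough to sketch in a few
lines of Python. This is only an illustration of the checks (15 characters or
better, both upper and lower case, a maximum age), not a real password filter
for a DC, and the function names and defaults are mine.

```python
from datetime import date


def meets_policy(password: str, min_length: int = 15) -> bool:
    """Simple complexity: long enough and contains BOTH upper and lower case."""
    return (
        len(password) >= min_length
        and any(c.isupper() for c in password)
        and any(c.islower() for c in password)
    )


def is_expired(last_set: date, today: date, max_age_days: int = 91) -> bool:
    """True once the password is older than max_age_days (e.g. 91 users, 30 admins)."""
    return (today - last_set).days > max_age_days


print(meets_policy("correct Horse battery"))   # long pass phrase, mixed case
print(meets_policy("Short1"))                  # too short
print(is_expired(date(2004, 1, 1), date(2004, 4, 30)))       # normal user, 91 days
print(is_expired(date(2004, 4, 1), date(2004, 4, 30), 30))   # admin, 30 days
```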
joe
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of deji Agba
Sent: Friday, April 30, 2004 1:34 AM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] Replication issues
The password will get replicated "out of band" [1] back to the PDC on a
password change. See
http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/security/bpactlck.mspx,
specifically check the piece on "immediate replication".
I missed this. Let's hope I don't get
smacked too hard for it. But, are you saying password change qualifies for
"immediate" (or urgent) replication? Not according to this:
By default, urgent replication does not occur across site
boundaries. Because of this, administrators should make manual password changes
and account resets on a domain controller that is in that user's site.
This is what acctinfo addressed. This was the problem I was facing
a year ago. My helpdesk admins in Santa Clara reset an EMEA (or
Tokyo) user's password. They call up the user and say "here's your
password", user tries it and hits the lockout threshold, BAM! user is locked
out. User gets really PO'ed because now he can't get helpdesk, because helpdesk
had left for the day shortly after calling user. I unlock user's account, which
now triggers urgent replication, tell user "wait for about 5-10 minutes and try
it". User is then able to login and make that million dollars sales
presentation. I get bonus, and I'm still employed because I'm the "Guru".
Helpdesk get the shaft and they are pissed at me for not telling them about
this "feature".
Now, I will shut up. Really :)
Sincerely,
Dèjì Akómöláfé, MCSE MCSA MCP+I
Microsoft MVP - Directory Services
www.readymaids.com - we know IT
www.akomolafe.com
Do you now realize that Today is the Tomorrow you were worried about Yesterday? -anon
From: joe
Sent: Thu 4/29/2004 3:43 PM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] Replication issues
The password will get replicated "out of band" [1] back to the PDC on a password change. See http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/security/bpactlck.mspx, specifically check the piece on "immediate replication".

"Theoretically, there should be no need for these tools, but in reality, chaining did not work as designed."

Yes it actually does, I see it in action every single day. We process thousands of password requests a day. It does work. Wherever the password is changed, it gets back to the PDC and then whatever DC is hit, the request is chained back to the PDC to allow the authentication.

"before the locking out DC learns about the reset."

Lockouts are handled differently. Dig into the documentation. An unlock has some special stuff around it in terms of how often it will go back and check. I don't recall the details, however, not every attempt is sent back to the PDC when the account is locally locked. I believe the logic was put in to protect the PDC from being DOSed by things like viruses and such that pound the DCs.

The "AvoidPDConWAN" setting will of course change the default functionality, that is what it was designed to do. If someone blindly applied it without understanding the repercussions, they deserve everything that happens to them. See http://support.microsoft.com/default.aspx?scid=kb;EN-US;232690 / http://support.microsoft.com/?kbid=225511 for more info on the AvoidPDConWAN setting.

One other thing I want to point out that is usually documented horribly. Password changes are urgently replicated within a site, not to all domain controllers. So if you change a password, you will go through urgent notification (i.e. bypassing the holdback time) within the site and those DCs will replicate in an urgent manner [2]. Once you hit site boundaries that are living with normal site link replication periods then you wait for that replication period to come up to get that password sent across.
So if you have a 4 day wait on the link, then you wait that long to get that replication through. If you don't have avoidpdconwan set though and you have good connectivity, this will not be an issue. If you do, the very fact that you set that setting means you WANT to have to go change the password on the DC the user is using. In a simple environment this is a trivial thing to work out (assuming proper configuration everywhere). In a large complex environment this can be decidedly non-trivial.

joe

[1] A specific RPC call is made. I have seen this in action with one of my tools that watches DCs for changes and notifies on object modifications. The longest delay I have seen has been about 500ms. However if the PDC is for some reason unavailable, this call will fail and the password will get back to the PDC through the standard replication methods.

[2] I don't believe however that the priority is any higher than any other domain context change, just simply the notification is urgent which means that if there is a queue on the inbound thread on what it is working on, it will get thrown at the bottom of the items with the same priority.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, April 28, 2004 7:30 PM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] Replication issues

>>It will get that password back immediately unless the PDC is really
>>busy or otherwise unavailable

The way I'm reading this is that you are saying password change will trigger immediate replication to the PDCE. In my experience (which I don't have to describe to you :)), this is not the case. Also, I may be misreading you here, because, further down, you said:

>>What SHOULD happen is that the local DC should realize, hey this
>>password isn't correct and will do what is called a PDC Chaining to ask the PDC if the password specified is in fact ok [3]

This is the way it works, I agree here.
Now, you also said:

>>Assuming the PDC is available to that site, you should be able to
>>change a password anywhere on any DC and that password will get back to the DC.

This, too, is correct. However the problem is the time it takes for the password change to get back to the PDCE and then onward to the rest of the DCs. Where neither the HelpDesk (who reset the password) nor the User (whose password was reset) is in the site where the PDCE is located, the length of time it takes for the password change to travel across the wire is usually unacceptable. This is the reason one would want to reset the password at a DC local to the User. This is also one of the reasons for ALTools, especially the AcctInfo.dll part. Theoretically, there should be no need for these tools, but in reality, chaining did not work as designed. One DC would lock out a user's account, after the user's password had been reset on another DC, before the locking out DC learns about the reset.

Lastly, I have come across canned recommendations from "security consultants" telling clients to enable the AvoidPDConWAN registry key. I am sure some companies would have heeded that recommendation.

Sincerely,
Dèjì Akómöláfé, MCSE MCSA MCP+I
Microsoft MVP - Directory Services
www.readymaids.com - we know IT
www.akomolafe.com
Do you now realize that Today is the Tomorrow you were worried about Yesterday? -anon

________________________________
From: [EMAIL PROTECTED] on behalf of joe
Sent: Wed 4/28/2004 4:47 AM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] Replication issues

1. What do you think your replication latency is supposed to be based upon your knowledge of your topology and your link configurations? This isn't something you have to guess at. Look at your DC placement and your replication topology and it will tell you the exact theoretical max replication period you have.

2. What do you want it to be? 30-60 minutes would be a time frame for replication that means you changed the default link settings.
The default is 180 minutes per link (hop). This can be reduced to as low as 15 minutes without change notification and if you enable change notification it can go down to seconds (based on how busy the bridgeheads between the sites are). As a rule, people don't generally set up change notification across a WAN [1].

30-60 minutes could mean that you have 2-4 hops to get to the site with 15 minute delays or it could be you have 1-2 hops with 30 minute delays or it could be 1-2 hops with 15 minute delays with lots of DCs in each site and it taking 15 minutes to get to the proper outgoing bridgehead for each site. Lots of valid reasons for the timing, you need to understand what your theoretical maxes could be and then decide if you are outside of that. If outside of that the first thing I would do is look at my DRA Pending Queue on my servers in the replication path to make sure it was zeroing out every replication period. [2]

One thing I saw below I wanted to speak about... The out of band password force back to the PDC has been in W2K since RTM at least. It will get that password back immediately unless the PDC is really busy or otherwise unavailable (down, net down, PacMan on the ethernet line eating all of the packets, etc).

Now after all of this I will say you should NOT have to worry about changing passwords at the specific site. Assuming the PDC is available to that site, you should be able to change a password anywhere on any DC and that password will get back to the DC. Then the client should be able to log on ANYWHERE. What SHOULD happen is that the local DC should realize, hey this password isn't correct and will do what is called a PDC Chaining to ask the PDC if the password specified is in fact ok [3]. Assuming the password is ok, the PDC will say, that is fine and let the user log on. This functionality has been in Windows all the way back in NT. Without it, life in large companies would be miserable.
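The per-hop arithmetic in this message (180 minutes default per link, 15 minutes minimum without change notification) can be sketched as a small Python calculation. This is a rough illustration only: the function and constant names are hypothetical, it ignores change notification and schedule windows, and the intrasite allowance is a single lump figure you supply yourself.

```python
DEFAULT_LINK_INTERVAL_MIN = 180  # default replication interval per site link (hop)
MIN_LINK_INTERVAL_MIN = 15       # lowest interval without change notification


def worst_case_latency(link_intervals_min: list[int], intrasite_min: int = 0) -> int:
    """Theoretical max minutes for a change to cross the given site links.

    link_intervals_min -- replication interval of each link on the path
    intrasite_min      -- extra allowance for reaching bridgeheads inside sites
    """
    for interval in link_intervals_min:
        if interval < MIN_LINK_INTERVAL_MIN:
            raise ValueError("intervals below 15 minutes need change notification")
    return sum(link_intervals_min) + intrasite_min


print(worst_case_latency([15, 15, 15, 15]))                 # 4 hops at 15 minutes
print(worst_case_latency([30, 30]))                         # 2 hops at 30 minutes
print(worst_case_latency([15, 15], intrasite_min=30))       # 2 hops plus bridgehead delays
print(worst_case_latency([DEFAULT_LINK_INTERVAL_MIN] * 2))  # 2 hops on defaults
```

If the latency you actually observe is above the number this gives for your own path, that is the point to go look at the DRA Pending queue.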
Now there has been a change in the functionality since 2K RTM to fix what I consider a design flaw / bug in this process. I can't recall when that exactly went in for 2K (SP3?) but it was in K3 RC1; I have written previously about this fix on this list. Basically the issue was if the user needed to change the password on the next logon and the PDC chaining event occurred, the logon would succeed and the client would be told to display the change password dialogue. The user would respond and use, as the "old password", the password they just used to log on. Since that password wasn't yet at the local DC that was handling this change password request, the local DC would say that the old password was incorrect and reject the change. I have already speculated in previous posts to this list about what was happening. Basically it was fixed by sending back key information to the remote DC during a PDC Chaining operation that brought that DC up to date for some critical authentication information so that it did indeed have the latest password information for that user.

So all of that to say, that unless you have horrendous network connectivity, you should not have to set passwords on specific DCs if you are up to the current patch levels of Windows 2000 or on Windows 2003 for your domain controllers.

joe

[1] There are exceptions here so I am not looking for people to email to say, we are and here's why... There are a couple of special cases where I do it as well - to keep Exchange in a good mood. The exceptions make the rule and show the beauty of the flexibility of the system.

[2] Keep in mind there was a bug in a hotfix or two between SP2-3 that caused this queue to not have good values. It would increment sometimes and exit without remembering to decrement. Very unusual as it will look almost like your queue isn't clearing. In this case, you can pull out repadmin /queue or my adqueueloop to look at the actual queue and verify what it is doing.
This is fixed in SP4 and actually one of the 4 new hotfixes that just came out also corrects it (obviously the bin with that code was replaced in one of the fixes and it has all of the previous fixes in it as well). So if you are at the minimum you should be for these last three crits, your counters should be working ok.

[3] The DCs realize that they may not have the latest password and go ask the "master" for verification. This is one of the "big" functions of the PDC, being "master" of the passwords. It may not have the current right password, but it is final arbiter on whether or not a certain password can be used if another DC isn't sure.

________________________________
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Coleman, Hunter
Sent: Tuesday, April 27, 2004 10:46 AM
To: '[EMAIL PROTECTED]'
Subject: RE: [ActiveDir] Replication issues

It's strictly a judgment call. You decide how important it is to have password changes replicate *now* and then weigh that against the costs of having very low replication latency. Costs might include available bandwidth, other applications using the same network, etc... In general, I'd stay away from letting this be the driving factor in determining your replication schedule. Change the password in the user's site, and 99% of the time the user should be fine within 15 minutes (default intrasite maximum replication period if you have 5 or more DCs in the site) or less.

________________________________
From: Rimmerman, Russ [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 27, 2004 7:40 AM
To: '[EMAIL PROTECTED]'
Subject: RE: [ActiveDir] Replication issues

What does changing the replication schedules explicitly for password resets entail, and is it recommended?
________________________________
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Coleman, Hunter
Sent: Tuesday, April 27, 2004 8:25 AM
To: '[EMAIL PROTECTED]'
Subject: RE: [ActiveDir] Replication issues

Unless you want to start changing your replication schedules explicitly for password resets, you're doing the right thing. Change the password on a DC in the user's site. If you're at SP4 (I think, could have been SP3) then the password change will also get sent on to the PDC emulator immediately. Anytime a user enters an incorrect password, the local DC will pass on the request to the PDCE in case the password had changed on a different DC.

The Account Lockout Status tool is probably the best utility for checking on password replication. Among other things, it will show the timestamp for password last set on each domain controller, so you can have a good idea of the replication state on the change.

http://www.microsoft.com/downloads/details.aspx?FamilyID=d1a5ed1d-cd55-4829-a189-99515b0e90f7&DisplayLang=en (watch for URL wrap)

Hunter

________________________________
From: Rimmerman, Russ [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 27, 2004 7:07 AM
To: '[EMAIL PROTECTED]'
Subject: [ActiveDir] Replication issues

We have always been having weird issues with replication. We have about 30 AD sites all over the world. When we change or reset a password here for a user at a remote site, it takes quite a long time (30-60 minutes or more) to replicate to the user's site. So, we are having to connect to their local domain controller and reset the password there. What is the best practice for setting up and tuning replication and resetting passwords, and what tools are recommended (replmon?) for "testing" it, and how long should it take?
List info : http://www.activedir.org/mail_list.htm
List FAQ : http://www.activedir.org/list_faq.htm
List archive: http://www.mail-archive.com/activedir%40mail.activedir.org/