On 05/20/2012 02:28 AM, Gelen James wrote:
Rebuilding the old IPA master A was only a partial success too. The error also occurs on the CA replication side.

After preparing a replica file on replica B, nuking and reinstalling the old A, and recreating A from the replica info file prepared on B, the user LDAP replication works fine, while the CA replication is badly broken. The error messages on A in /var/log/dirsrv/slapd-PKI-IPA/errors are pasted below:

[20/May/2012:01:17:36 -0700] - 389-Directory/1.2.9.16 B2012.023.214 starting up
[20/May/2012:01:17:36 -0700] NSMMReplicationPlugin - replica_check_for_data_reload: Warning: data for replica o=ipaca does not match the data in the changelog (replica data (4fb8a7f3000404430000) > changelog (4fb84ba7000000560000)). Recreating the changelog file. This could affect replication with replica's consumers in which case the consumers should be reinitialized.

This error message is normal - you should only see this once, just after a replica has been initialized.

[20/May/2012:01:17:37 -0700] - slapd started. Listening on All Interfaces port 7389 for LDAP requests
[20/May/2012:01:17:37 -0700] - Listening on All Interfaces port 7390 for LDAPS requests
[root@<A> ~]#

Checking the RUV records shows a replica ID that looks too big: 1091, while all the others are smaller than 100.

It's not "too big" as far as the protocol is concerned, but it is strange that it is so much larger than the other values.


There are no RUV records to delete/clear.

dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,o=ipaca
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 4fb8187f000000600000
nsds50ruv: {replica 97 ldap://B.example.com:7389} 4fb81886000000
 610000 4fb8a7ca000100610000
nsds50ruv: {replica 1091 ldap://A.example.com:7389} 4fb8a7c60001044
 30000 4fb8a8a9000104430000
nsds50ruv: {replica 91 ldap://C.example.com:7389} 4fb81f54000000
 5b0000 4fb84db60000005b0000
nsds50ruv: {replica 86 ldap://D.example.com:7389} 4fb821a6000000
 560000 4fb84ba7000000560000
o: ipaca
nsruvReplicaLastModified: {replica 97 ldap://B.example.com:7389}
  4fb8a7c7
nsruvReplicaLastModified: {replica 1091 ldap://A.example.com:7389}
 4fb8a8a6
nsruvReplicaLastModified: {replica 91 ldap://C.example.com:7389}
  00000000
nsruvReplicaLastModified: {replica 86 ldap://D.example.com:7389}
  00000000
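
(For reference, the tombstone entry above can be pulled with a search along these lines, run on the server in question. This is only a sketch; the Directory Manager bind and port 7389 of the PKI-IPA instance are assumptions based on the logs above:)

ldapsearch -x -D "cn=Directory Manager" -W -p 7389 -b "o=ipaca" \
  "(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nsTombstone))" \
  nsds50ruv nsruvReplicaLastModified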

Please advise. Thanks.

--Gelen






------------------------------------------------------------------------
*From:* Gelen James <hahaha_...@yahoo.com>
*To:* Rob Crittenden <rcrit...@redhat.com>; Dmitri Pal <d...@redhat.com>
*Cc:* "Freeipa-users@redhat.com" <Freeipa-users@redhat.com>
*Sent:* Sunday, May 20, 2012 12:08 AM
*Subject:* Re: [Freeipa-users] Please help: How to restore IPA Master/Replicas from daily IPA Replica setup???

Hi Dmitri, Rob and all,

Thanks for your instructions. I've performed your steps for case #1: replacing a failed IPA master. The results, along with my confusion and questions, are detailed below. In general, please set up your own real test environment and write down the detailed steps one by one, clearly.

It has taken me more than a week and I still have no clue. Frankly, the steps in your earlier email are somewhat over-simplified for ordinary IPA users, and they do not cover how the CA LDAP backend is handled.

The problem is the CA backend. All the replicas are still trying to sync with the old, failed IPA master, even after a reboot.

Could it be that 'ipa-replica-manage' only manages the user data replication, and 'ipa-csreplica-manage' only handles the CA replication? In other words, when building or tearing down IPA replication between two servers, do we need to handle both replication types, with 'ipa-replica-manage' AND 'ipa-csreplica-manage'? If so, which one should run first?

The error messages in /var/log/dirsrv/slapd-PKI-IPA/errors are attached; they are the same on replicas B, C and D.

[19/May/2012:19:40:48 -0700] - 389-Directory/1.2.9.16 B2012.023.214 starting up
[19/May/2012:19:40:48 -0700] - slapd started. Listening on All Interfaces port 7389 for LDAP requests
[19/May/2012:19:40:48 -0700] - Listening on All Interfaces port 7390 for LDAPS requests
[19/May/2012:19:40:50 -0700] slapi_ldap_bind - Error: could not send startTLS request: error -1 (Can't contact LDAP server)
[19/May/2012:19:40:50 -0700] NSMMReplicationPlugin - agmt="cn=cloneAgreement1-B.example.com-pki-ca" (<A>:7389): Replication bind with SIMPLE auth failed: LDAP error -1 (Can't contact LDAP server) ((null))
[19/May/2012:19:40:57 -0700] slapi_ldap_bind - Error: could not send startTLS request: error -1 (Can't contact LDAP server)
[19/May/2012:19:41:03 -0700] slapi_ldap_bind - Error: could not send startTLS request: error -1 (Can't contact LDAP server)
[19/May/2012:19:41:15 -0700] slapi_ldap_bind - Error: could not send startTLS request: error -1 (Can't contact LDAP server)
[19/May/2012:19:41:39 -0700] slapi_ldap_bind - Error: could not send startTLS request: error -1 (Can't contact LDAP server)
[19/May/2012:19:42:27 -0700] slapi_ldap_bind - Error: could not send startTLS request: error -1 (Can't contact LDAP server)
[19/May/2012:19:44:03 -0700] slapi_ldap_bind - Error: could not send startTLS request: error -1 (Can't contact LDAP server)
[19/May/2012:19:47:15 -0700] slapi_ldap_bind - Error: could not send startTLS request: error -1 (Can't contact LDAP server)
[root@<B> ~]#

After seeing the above messages, I tried to run the equivalent commands for CA replication; they report that the replication agreement (which replication agreement? user data, or CA data??) already exists.

on B,
ipa-csreplica-manage connect C
ipa-csreplica-manage connect D
ipa-csreplica-manage del A --force
ipactl restart

on C,
ipa-csreplica-manage del A --force
ipactl restart

on D,
ipa-csreplica-manage del A --force
ipactl restart


[root@B ~]# ipa-csreplica-manage --password=xxxxxxx connect C.example.com
This replication agreement already exists.
[root@B ~]#

[root@B ~]# ipa-csreplica-manage --password=xxxxxxx connect D.example.com
This replication agreement already exists.
[root@B ~]#

[root@B ~]# ipa-csreplica-manage --password=xxxxxxx del C.example.com --force
Unable to connect to replica A.example.com, forcing removal
Failed to get data from 'A.example.com': Can't contact LDAP server
Forcing removal on 'B.example.com'
[root@B ~]#

....

After restarting the IPA services on B, C and D, the error messages finally disappeared from the CA errors log file.

But we still cannot find the CA replication agreements. Please see the difference in output between 'ipa-replica-manage' and 'ipa-csreplica-manage':

[root@B ~] ipa-replica-manage list
B.example.com
C.example.com
D.example.com

[root@B ~] ipa-csreplica-manage list
B.example.com
C.example.com
D.example.com

[root@B ~] ipa-replica-manage list B.example.com
C.example.com
D.example.com

[root@B ~] ipa-csreplica-manage list B.example.com
## Nothing at all!
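
(If it helps, a raw search of the PKI-IPA instance's mapping tree should show whatever CA replication agreements actually exist on B. Again just a sketch; port 7389 and the Directory Manager bind are assumptions:)

ldapsearch -x -D "cn=Directory Manager" -W -p 7389 -b "cn=mapping tree,cn=config" \
  "(objectclass=nsds5replicationagreement)" nsDS5ReplicaHost nsDS5ReplicaPort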

Please check and give the correct commands and sequence for us IPA users. It is painful to spend so much time and still not get the restore to work as expected. Even worse, I have no idea how 'ipa-replica-manage' and 'ipa-csreplica-manage' work together behind the scenes.

Thanks a lot.

--Gelen




------------------------------------------------------------------------
*From:* Rob Crittenden <rcrit...@redhat.com>
*To:* Robinson Tiemuqinke <hahaha_...@yahoo.com>
*Cc:* "Freeipa-users@redhat.com" <Freeipa-users@redhat.com>; Rich Megginson <rmegg...@redhat.com>; Dmitri Pal <d...@redhat.com>
*Sent:* Tuesday, May 15, 2012 9:57 AM
*Subject:* Re: [Freeipa-users] Please help: How to restore IPA Master/Replicas from daily IPA Replica setup???

Robinson Tiemuqinke wrote:
> Hi Dmitri, Rich and all,
>
> I am a newbie to Red Hat IPA. It looks pretty cool compared with
> other solutions I've tried before. Thanks a lot for this great product! :)
>
> But there are still some things I need your help with. My main question is:
> How to restore the IPA setup with a daily machine-level IPA Replica backup?
>
> Please let me explain my IPA setup background and the backup/restore goals
> I am trying to reach:
>
> I'm running IPA 2.1.3 on Red Hat Enterprise Linux 6.2. The IPA master is set
> up with the Dogtag CA system and is installed first. Then two IPA replicas
> are installed -- with the '--setup-ca' option -- for load balancing and
> failover purposes.
>
> To describe my problems/objectives, I'll name the IPA master machine A and
> the IPA replicas B and C. I also have one extra IPA replica, 'D'
> (a virtual machine), set up ONLY for backup purposes.
> The setup looks like the following; A is the configuration hub, and B, C, D
> are siblings.
>
> A
> / | \
> B C D
>
> The following are the steps I use to back up the IPA setup and LDAP backends
> daily -- it is a whole machine-level backup (of virtual machine D).
>
> 1, First, IPA replica D is backed up daily. The backup happens like this:
>
> 1.1 On IPA replica D, run 'service ipa stop'. Then run 'shutdown -h <D>'.
> On the hypervisor which holds virtual machine D, do a daily backup of
> the whole virtual disk that D is on.
> 1.2 Turn on IPA replica D again.
> 1.3 After virtual machine D is up, optionally run
> 'ipa-replica-manage --force-sync --from <A>' on D to sync the IPA databases
> forcibly.
>
> Now comes the restore part, which is pretty confusing to me. I've tried
> several times, and every time it runs into one issue or another, so
> I am wondering what the correct steps/interaction between the IPA master
> and replicas are :(
>
> 2, case #1: A is broken, e.g. a disk failure, and then re-imaged after
> several days.
>
> 2.1 How to rebuild the IPA Master/Hub A after A is re-imaged, with the
> daily backup from IPA replica D?

The first thing you'll need to do is to connect your other replicas
together, either by picking a new hub or adding links to each one. Then
you'll need to delete the replication agreement to A. You should be left
with a set of servers that continues to replicate.

So, for argument's sake, we promote B to be the new hub:

On B:

# ipa-replica-manage connect C
# ipa-replica-manage connect D
# ipa-replica-manage del --force A
# ipactl restart

On C:

# ipa-replica-manage del --force A
# ipactl restart

On D:

# ipa-replica-manage del --force A
# ipactl restart

It is unclear what you mean by re-imaged. Are you restoring from backup
or installing it fresh? I'll assume it is a new install. You'll need to
prepare a replica file for A and install it as a replica. Then if you
want to keep A as the primary you'll need to change the replication
agreements back so that it is the hub (using ipa-replica-manage connect and
disconnect).
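
A minimal sketch of the replica-file step, assuming B is the server you
prepare from and that A gets a CA as well (the host names and the
/var/lib/ipa path are the usual defaults; adjust as needed):

On B:
# ipa-replica-prepare A.example.com
# scp /var/lib/ipa/replica-info-A.example.com.gpg root@A.example.com:/var/lib/ipa/

On the freshly installed A:
# ipa-replica-install --setup-ca /var/lib/ipa/replica-info-A.example.com.gpg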

When you install the new A server it should get all the changes it needs, and
you should be done.

You'll want to check the documentation on promoting a master to verify
that only one server is the CRL generator (at this point there may be none).
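
As a rough check (assuming the Dogtag 9 layout that IPA 2.x uses, with the CA
configuration in /var/lib/pki-ca/conf/CS.cfg), exactly one CA clone should have
the CRL generation flags enabled; on the CRL master you would expect something
like:

# grep -E "ca\.crl\.MasterCRL\.enable(CRLCache|CRLUpdates)" /var/lib/pki-ca/conf/CS.cfg
ca.crl.MasterCRL.enableCRLCache=true
ca.crl.MasterCRL.enableCRLUpdates=true

Both values should be false on every other CA clone.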

> 2.2 do I have to check some files on A into subversion immediately after
> A was initially installed?

The only thing you really need to save is the cacert.p12 file. This is
your root CA.
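
For example (the /root/cacert.p12 location is where ipa-server-install leaves
it on the first master; the destination host and path are placeholders):

# scp /root/cacert.p12 root@backuphost.example.com:/srv/backup/ipa/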

> 2.3 Please describe the steps. I'll follow exactly and report the results.
>
> 3, case #2: A is working, but either B or C is broken.
>
> 3.1 It looks like I don't need the daily backup of D to kick in; is that
> right?

No, D is unrelated.

> 3.2 What are the correct steps on A; and B after it is re-imaged?

On A:
# ipa-replica-manage del B
# ipactl restart
# ipa-replica-prepare B.example.com

On B:
# ipa-replica-install replica-info-B.example.com.gpg

You'll probably need/want to clean RUV,
http://directory.fedoraproject.org/wiki/Howto:CLEANRUV
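
As a sketch of the manual procedure from that page, applied to the CA backend
(the o=ipaca suffix, port 7389, and the <RID> placeholder for the stale
replica ID are all assumptions to fill in):

# ldapmodify -x -D "cn=Directory Manager" -W -p 7389 <<'EOF'
dn: cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config
changetype: modify
replace: nsds5task
nsds5task: CLEANRUV<RID>
EOF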

> 3.3 Please describe the steps. I'll follow exactly and report the results.
>
> 4, case #3: some unexpected IPA change happens on A -- like all
> users being deleted by human mistake -- and, even worse, the changes
> are propagated to B and C within minutes.
>
> 4.1 How can I recover the IPA setup from daily backup from D?

We have not yet documented how to recover from tombstones or an offline
replica.

> 4.2 Which IPA master/replicas should I recover first? IPA master A, or
> IPA replicas B/C? And then how do I recover the remaining ones, one by one?

If the entries are re-added on any of the replicas they will be propagated
out.

> 4.3 Do I have to disconnect the replication agreements of B, C, D from A first?

Depends on how 4.1 gets answered, which we are still investigating.

> 4.4 Please describe the steps. I'll follow exactly and report the results.
>
> I've heard something about tombstone records too. I'm not sure whether the
> problem still exists in 2.1.3 or 2.2.0 (on 6.3 Beta). If so, how can I
> avoid it with the correct recovery steps/interactions?

It is RUV that is the problem. This 389-ds wiki page describes how to
clean up: http://directory.fedoraproject.org/wiki/Howto:CLEANRUV

The 389-ds team is working to make this less manual.

rob



_______________________________________________
Freeipa-users mailing list
Freeipa-users@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-users
