Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
Rob Crittenden wrote (07/03/2012):

> Deleting a replica can leave a Replication Update Vector (RUV) on the
> other servers. This can confuse things if the replica is re-added, and
> it also causes the server to calculate changes against a server that
> may no longer exist.
>
> 389-ds-base provides a new task that self-propagates itself to all
> available replicas to clean this RUV data. This patch creates that
> task at deletion time to hopefully clean things up.
>
> It isn't perfect. If any replica is down or unavailable at the time
> the cleanruv task fires, and then comes back up, the old RUV data may
> be re-propagated around.
>
> To make things easier in this case I've added two new commands to
> ipa-replica-manage. The first lists the replication ids of all the
> servers we have a RUV for. Using this you can call clean_ruv with the
> replication id of a server that no longer exists to try the
> cleanallruv step again.
>
> This is quite dangerous, though. If you run cleanruv against a
> replica id that does exist, it can cause a loss of data. I believe
> I've put in enough scary warnings about this.
>
> rob

Martin Kosek wrote:

Good work there, this should make cleaning RUVs much easier than with
the previous version. This is what I found during review:

1) The list_ruv and clean_ruv command help is quite lost in the man
page. I think it would help if, for example, we indented all the
information for each command; as it is, a user can easily overlook the
new commands in the man page.

2) I would rename the new commands to clean-ruv and list-ruv to make
them consistent with the rest of the commands (re-initialize,
force-sync).

3) It would be nice to be able to run the clean_ruv command in an
unattended way (for better testing), i.e. to respect the --force
option as we already do for ipa-replica-manage del. This fix would aid
test automation in the future.

4) (minor) The new question (and the one in del too) does not react
well to CTRL+D:

# ipa-replica-manage clean_ruv 3 --force
Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389

Cleaning the wrong replica ID will cause that server to no
longer replicate so it may miss updates while the process
is running. It would need to be re-initialized to maintain
consistency. Be very careful.
Continue to clean? [no]: unexpected error:

5) Help for the clean_ruv command without its required parameter is
quite confusing, as it reports that the command is wrong rather than
the parameter:

# ipa-replica-manage clean_ruv
Usage: ipa-replica-manage [options]

ipa-replica-manage: error: must provide a command [clean_ruv |
force-sync | disconnect | connect | del | re-initialize | list |
list_ruv]

It seems you just forgot to specify the error message in the command
definition.

6) When the remote replica is down, the clean_ruv command fails with
an unexpected error:

[root@vm-086 ~]# ipa-replica-manage clean_ruv 5
Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389

Cleaning the wrong replica ID will cause that server to no
longer replicate so it may miss updates while the process
is running. It would need to be re-initialized to maintain
consistency. Be very careful.
Continue to clean? [no]: y
unexpected error: {'desc': 'Operations error'}

/var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors:
[04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task:
failed to connect to replagreement connection
(cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config),
error 105
[04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task:
replica
(cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config)
has not been cleaned. You will need to rerun the CLEANALLRUV task on
this replica.
[04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task:
Task failed (1)

In this case I think we should inform the user that the command
failed, possibly because of disconnected replicas, and that they could
enable the replicas and try again.

7) (minor) "pass" is now redundant in replication.py:

+        except ldap.INSUFFICIENT_ACCESS:
+            # We can't make the server we're removing read-only but
+            # this isn't a show-stopper
+            root_logger.debug("No permission to switch replica to read-only, continuing anyway")
+            pass

Rob Crittenden wrote:

I think this addresses everything.

rob

Martin Kosek wrote:

Thanks, almost there! I just found one more issue which needs to be
fixed before we push:

# ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force
Directory Manager password:

Unable to connect
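Review points 3 and 4 could be handled together in one prompt helper: honor --force by skipping the question entirely, and catch the EOFError that CTRL+D raises instead of surfacing an "unexpected error". A minimal sketch under assumptions (the helper name and signature are hypothetical, not FreeIPA's actual code, and Python 3's input() stands in for the raw_input() the patch-era code uses):

```python
def confirm(prompt, default=False, force=False):
    """Ask a yes/no question on stdin, returning a bool."""
    if force:
        # Point 3: --force makes the command unattended
        return True
    try:
        answer = input('%s [%s]: ' % (prompt, 'yes' if default else 'no'))
    except EOFError:
        # Point 4: CTRL+D closes stdin; fall back to the default answer
        # instead of dying with "unexpected error"
        print()
        return default
    if not answer:
        return default
    return answer.strip().lower() in ('y', 'yes')
```

With a helper like this, "Continue to clean?" would quietly take the default "no" on CTRL+D rather than raising.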
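For point 6, one way to turn the raw LDAP failure into an actionable message is to special-case the error description before printing it. This is only a sketch of the suggestion (the function and the wording are hypothetical, not what was committed):

```python
def friendly_clean_error(desc):
    """Map an LDAP error description from the CLEANALLRUV task to a hint."""
    if desc == 'Operations error':
        # Seen when a replica in the topology is down or disconnected
        return ('CLEANALLRUV task failed, possibly because some replicas '
                'are down or disconnected. Enable the replicas and try '
                'the clean_ruv command again.')
    return 'unexpected error: %s' % desc
```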
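For background on what the patch is driving: a CLEANALLRUV run is started by adding a task entry under cn=cleanallruv,cn=tasks,cn=config on the 389-ds server, which then propagates the cleanup to the other replicas. A sketch of building such an entry (the DN layout and attribute names follow the 389-ds-base task interface; the helper name and example suffix are hypothetical, and this is not the patch's actual code):

```python
def cleanallruv_task_entry(replica_id, suffix):
    """Build the (dn, attrs) pair for a 389-ds CLEANALLRUV task entry."""
    dn = 'cn=clean %d,cn=cleanallruv,cn=tasks,cn=config' % replica_id
    attrs = [
        ('objectClass', ['top', 'extensibleObject']),
        # Replicated suffix whose RUVs should be scrubbed
        ('replica-base-dn', [suffix]),
        # The stale replica id to remove from every server's RUV
        ('replica-id', [str(replica_id)]),
    ]
    return dn, attrs

# On an authenticated python-ldap connection this would become:
#   dn, attrs = cleanallruv_task_entry(5, 'dc=example,dc=com')
#   conn.add_s(dn, [(t, [v.encode() for v in vs]) for t, vs in attrs])
```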
respect --force option as we already do for ipa-replica-manage del. This fix would aid test automation in the future. 4) (minor) The new question (and the del too) does not react too well for CTRL+D: # ipa-replica-manage clean_ruv 3 --force Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful. Continue to clean? [no]: unexpected error: 5) Help for clean_ruv command without a required parameter is quite confusing as it reports that command is wrong and not the parameter: # ipa-replica-manage clean_ruv Usage: ipa-replica-manage [options] ipa-replica-manage: error: must provide a command [clean_ruv | force-sync | disconnect | connect | del | re-initialize | list | list_ruv] It seems you just forgot to specify the error message in the command definition 6) When the remote replica is down, the clean_ruv command fails with an unexpected error: [root@vm-086 ~]# ipa-replica-manage clean_ruv 5 Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful. Continue to clean? [no]: y unexpected error: {'desc': 'Operations error'} /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors: [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: failed to connect to replagreement connection (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica, cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config), error 105 [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: replica (cn=meTovm-055.idm.lab. 
bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree, cn=config) has not been cleaned. You will need to rerun the CLEANALLRUV task on this replica. [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: Task failed (1) In this case I think we should inform user that the command failed, possibly because of disconnected replicas and that they could enable the replicas and try again. 7) (minor) "pass" is now redundant in replication.py: +except ldap.INSUFFICIENT_ACCESS: +# We can't make the server we're removing read-only but +# this isn't a show-stopper +root_logger.debug("No permission to switch replica to read-only, continuing anyway") +pass I think this addresses everything. rob Thanks, almost there! I just found one more issue which needs to be fixed before we push: # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force Directory Manager password: Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal Failed to
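For reference, the CLEANALLRUV operation the patch drives is an ordinary 389-ds-base task: the client adds an entry under cn=cleanallruv,cn=tasks,cn=config and the server propagates the clean itself. A minimal sketch of building such an entry (the helper name is hypothetical, not the FreeIPA code; attribute names follow the 389-ds-base task documentation, and the suffix/replica id are examples):

```python
def cleanallruv_task(rid, suffix, force=False):
    """Return (dn, attrs) for a 389-ds-base CLEANALLRUV task entry.

    rid    -- replication id to clean (as shown by list_ruv)
    suffix -- replicated base DN, e.g. "dc=example,dc=com"
    force  -- map to replica-force-cleaning so the task does not wait
              for unreachable replicas
    """
    dn = "cn=clean %d,cn=cleanallruv,cn=tasks,cn=config" % rid
    attrs = {
        "objectclass": ["top", "extensibleObject"],
        "replica-base-dn": [suffix],
        "replica-id": [str(rid)],
        "replica-force-cleaning": ["yes" if force else "no"],
    }
    return dn, attrs
```

With python-ldap the returned pair would be submitted as `conn.add_s(dn, ldap.modlist.addModlist(attrs))`; the directory server then picks the task up and spreads it to every reachable replica.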
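The friendlier failure reporting suggested in item 6 could look roughly like this. This is a sketch, not the actual FreeIPA code: the exception class and the task-submission callable are stand-ins so the snippet runs without a directory server.

```python
class OperationsError(Exception):
    """Stand-in for ldap.OPERATIONS_ERROR raised when the task cannot run."""

def clean_ruv(submit_task, rid):
    """Submit a CLEANALLRUV task and translate the raw LDAP failure.

    submit_task -- callable that adds the task entry for replica id `rid`
    """
    try:
        submit_task(rid)
    except OperationsError:
        # Instead of surfacing "unexpected error: {'desc': 'Operations
        # error'}", point the user at the likely cause and the remedy.
        raise SystemExit(
            "CLEANALLRUV task for replica id %d failed, possibly because "
            "one or more replicas are unreachable. Bring all replicas "
            "back online and run clean_ruv again." % rid)
```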
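Item 7's fix is simply dropping the trailing `pass`: an `except` body that ends in a call is already complete. A standalone sketch of the handler (the exception class and logger are stand-ins for the real `ldap.INSUFFICIENT_ACCESS` and `root_logger`):

```python
import logging

logger = logging.getLogger("replication")

class InsufficientAccess(Exception):
    """Stand-in for ldap.INSUFFICIENT_ACCESS so the sketch runs standalone."""

def switch_readonly(make_readonly, server):
    """Try to make `server` read-only before removal; tolerate no access."""
    try:
        make_readonly(server)
    except InsufficientAccess:
        # We can't make the server we're removing read-only, but this
        # isn't a show-stopper; the debug() call is a complete statement,
        # so no trailing `pass` is needed.
        logger.debug("No permission to switch replica %s to read-only, "
                     "continuing anyway", server)
    return True
```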
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
On 09/06/2012 06:05 PM, Martin Kosek wrote: > On 09/06/2012 05:55 PM, Rob Crittenden wrote: >> Rob Crittenden wrote: >>> Rob Crittenden wrote: Martin Kosek wrote: > On 09/05/2012 08:06 PM, Rob Crittenden wrote: >> Rob Crittenden wrote: >>> Martin Kosek wrote: On 07/05/2012 08:39 PM, Rob Crittenden wrote: > Martin Kosek wrote: >> On 07/03/2012 04:41 PM, Rob Crittenden wrote: >>> Deleting a replica can leave a replication vector (RUV) on the >>> other servers. >>> This can confuse things if the replica is re-added, and it also >>> causes the >>> server to calculate changes against a server that may no longer >>> exist. >>> >>> 389-ds-base provides a new task that self-propogates itself to all >>> available >>> replicas to clean this RUV data. >>> >>> This patch will create this task at deletion time to hopefully >>> clean things up. >>> >>> It isn't perfect. If any replica is down or unavailable at the >>> time >>> the >>> cleanruv task fires, and then comes back up, the old RUV data >>> may be >>> re-propogated around. >>> >>> To make things easier in this case I've added two new commands to >>> ipa-replica-manage. The first lists the replication ids of all the >>> servers we >>> have a RUV for. Using this you can call clean_ruv with the >>> replication id of a >>> server that no longer exists to try the cleanallruv step again. >>> >>> This is quite dangerous though. If you run cleanruv against a >>> replica id that >>> does exist it can cause a loss of data. I believe I've put in >>> enough scary >>> warnings about this. >>> >>> rob >>> >> >> Good work there, this should make cleaning RUVs much easier than >> with the >> previous version. >> >> This is what I found during review: >> >> 1) list_ruv and clean_ruv command help in man is quite lost. I >> think >> it would >> help if we for example have all info for commands indented. This >> way >> user could >> simply over-look the new commands in the man page. 
>> >> >> 2) I would rename new commands to clean-ruv and list-ruv to make >> them >> consistent with the rest of the commands (re-initialize, >> force-sync). >> >> >> 3) It would be nice to be able to run clean_ruv command in an >> unattended way >> (for better testing), i.e. respect --force option as we already >> do for >> ipa-replica-manage del. This fix would aid test automation in the >> future. >> >> >> 4) (minor) The new question (and the del too) does not react too >> well for >> CTRL+D: >> >> # ipa-replica-manage clean_ruv 3 --force >> Clean the Replication Update Vector for >> vm-055.idm.lab.bos.redhat.com:389 >> >> Cleaning the wrong replica ID will cause that server to no >> longer replicate so it may miss updates while the process >> is running. It would need to be re-initialized to maintain >> consistency. Be very careful. >> Continue to clean? [no]: unexpected error: >> >> >> 5) Help for clean_ruv command without a required parameter is quite >> confusing >> as it reports that command is wrong and not the parameter: >> >> # ipa-replica-manage clean_ruv >> Usage: ipa-replica-manage [options] >> >> ipa-replica-manage: error: must provide a command [clean_ruv | >> force-sync | >> disconnect | connect | del | re-initialize | list | list_ruv] >> >> It seems you just forgot to specify the error message in the >> command >> definition >> >> >> 6) When the remote replica is down, the clean_ruv command fails >> with an >> unexpected error: >> >> [root@vm-086 ~]# ipa-replica-manage clean_ruv 5 >> Clean the Replication Update Vector for >> vm-055.idm.lab.bos.redhat.com:389 >> >> Cleaning the wrong replica ID will cause that server to no >> longer replicate so it may miss updates while the process >> is running. It would need to be re-initialized to maintain >> consistency. Be very careful. >> Continue to clean? 
[no]: y >> unexpected error: {'desc': 'Operations error'} >> >> >> /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors: >> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - >> cleanAllRUV_task: failed >> to connect to repl
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
On 09/06/2012 05:55 PM, Rob Crittenden wrote: > Rob Crittenden wrote: >> Rob Crittenden wrote: >>> Martin Kosek wrote: On 09/05/2012 08:06 PM, Rob Crittenden wrote: > Rob Crittenden wrote: >> Martin Kosek wrote: >>> On 07/05/2012 08:39 PM, Rob Crittenden wrote: Martin Kosek wrote: > On 07/03/2012 04:41 PM, Rob Crittenden wrote: >> Deleting a replica can leave a replication vector (RUV) on the >> other servers. >> This can confuse things if the replica is re-added, and it also >> causes the >> server to calculate changes against a server that may no longer >> exist. >> >> 389-ds-base provides a new task that self-propogates itself to all >> available >> replicas to clean this RUV data. >> >> This patch will create this task at deletion time to hopefully >> clean things up. >> >> It isn't perfect. If any replica is down or unavailable at the >> time >> the >> cleanruv task fires, and then comes back up, the old RUV data >> may be >> re-propogated around. >> >> To make things easier in this case I've added two new commands to >> ipa-replica-manage. The first lists the replication ids of all the >> servers we >> have a RUV for. Using this you can call clean_ruv with the >> replication id of a >> server that no longer exists to try the cleanallruv step again. >> >> This is quite dangerous though. If you run cleanruv against a >> replica id that >> does exist it can cause a loss of data. I believe I've put in >> enough scary >> warnings about this. >> >> rob >> > > Good work there, this should make cleaning RUVs much easier than > with the > previous version. > > This is what I found during review: > > 1) list_ruv and clean_ruv command help in man is quite lost. I > think > it would > help if we for example have all info for commands indented. This > way > user could > simply over-look the new commands in the man page. > > > 2) I would rename new commands to clean-ruv and list-ruv to make > them > consistent with the rest of the commands (re-initialize, > force-sync). 
> > > 3) It would be nice to be able to run clean_ruv command in an > unattended way > (for better testing), i.e. respect --force option as we already > do for > ipa-replica-manage del. This fix would aid test automation in the > future. > > > 4) (minor) The new question (and the del too) does not react too > well for > CTRL+D: > > # ipa-replica-manage clean_ruv 3 --force > Clean the Replication Update Vector for > vm-055.idm.lab.bos.redhat.com:389 > > Cleaning the wrong replica ID will cause that server to no > longer replicate so it may miss updates while the process > is running. It would need to be re-initialized to maintain > consistency. Be very careful. > Continue to clean? [no]: unexpected error: > > > 5) Help for clean_ruv command without a required parameter is quite > confusing > as it reports that command is wrong and not the parameter: > > # ipa-replica-manage clean_ruv > Usage: ipa-replica-manage [options] > > ipa-replica-manage: error: must provide a command [clean_ruv | > force-sync | > disconnect | connect | del | re-initialize | list | list_ruv] > > It seems you just forgot to specify the error message in the > command > definition > > > 6) When the remote replica is down, the clean_ruv command fails > with an > unexpected error: > > [root@vm-086 ~]# ipa-replica-manage clean_ruv 5 > Clean the Replication Update Vector for > vm-055.idm.lab.bos.redhat.com:389 > > Cleaning the wrong replica ID will cause that server to no > longer replicate so it may miss updates while the process > is running. It would need to be re-initialized to maintain > consistency. Be very careful. > Continue to clean? [no]: y > unexpected error: {'desc': 'Operations error'} > > > /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors: > [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - > cleanAllRUV_task: failed > to connect to replagreement connection > (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica, > > cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
Rob Crittenden wrote: Rob Crittenden wrote: Martin Kosek wrote: On 09/05/2012 08:06 PM, Rob Crittenden wrote: Rob Crittenden wrote: Martin Kosek wrote: On 07/05/2012 08:39 PM, Rob Crittenden wrote: Martin Kosek wrote: On 07/03/2012 04:41 PM, Rob Crittenden wrote: Deleting a replica can leave a replication vector (RUV) on the other servers. This can confuse things if the replica is re-added, and it also causes the server to calculate changes against a server that may no longer exist. 389-ds-base provides a new task that self-propogates itself to all available replicas to clean this RUV data. This patch will create this task at deletion time to hopefully clean things up. It isn't perfect. If any replica is down or unavailable at the time the cleanruv task fires, and then comes back up, the old RUV data may be re-propogated around. To make things easier in this case I've added two new commands to ipa-replica-manage. The first lists the replication ids of all the servers we have a RUV for. Using this you can call clean_ruv with the replication id of a server that no longer exists to try the cleanallruv step again. This is quite dangerous though. If you run cleanruv against a replica id that does exist it can cause a loss of data. I believe I've put in enough scary warnings about this. rob Good work there, this should make cleaning RUVs much easier than with the previous version. This is what I found during review: 1) list_ruv and clean_ruv command help in man is quite lost. I think it would help if we for example have all info for commands indented. This way user could simply over-look the new commands in the man page. 2) I would rename new commands to clean-ruv and list-ruv to make them consistent with the rest of the commands (re-initialize, force-sync). 3) It would be nice to be able to run clean_ruv command in an unattended way (for better testing), i.e. respect --force option as we already do for ipa-replica-manage del. 
This fix would aid test automation in the future. 4) (minor) The new question (and the del too) does not react too well for CTRL+D: # ipa-replica-manage clean_ruv 3 --force Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful. Continue to clean? [no]: unexpected error: 5) Help for clean_ruv command without a required parameter is quite confusing as it reports that command is wrong and not the parameter: # ipa-replica-manage clean_ruv Usage: ipa-replica-manage [options] ipa-replica-manage: error: must provide a command [clean_ruv | force-sync | disconnect | connect | del | re-initialize | list | list_ruv] It seems you just forgot to specify the error message in the command definition 6) When the remote replica is down, the clean_ruv command fails with an unexpected error: [root@vm-086 ~]# ipa-replica-manage clean_ruv 5 Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful. Continue to clean? [no]: y unexpected error: {'desc': 'Operations error'} /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors: [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: failed to connect to replagreement connection (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica, cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config), error 105 [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: replica (cn=meTovm-055.idm.lab. bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree, cn=config) has not been cleaned. 
You will need to rerun the CLEANALLRUV task on this replica. [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: Task failed (1) In this case I think we should inform user that the command failed, possibly because of disconnected replicas and that they could enable the replicas and try again. 7) (minor) "pass" is now redundant in replication.py: +except ldap.INSUFFICIENT_ACCESS: +# We can't make the server we're removing read-only but +# this isn't a show-stopper +root_logger.debug("No permission to switch replica to read-only, continuing anyway") +pass I think this addresses everything. rob Thanks, almost there! I just found one more issue which needs to be fixed before we push: # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force Directory Manager password: Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal Failed to get data from 'vm-055.idm.lab.bos.redhat.com': {'desc': "Can't conta
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
Rob Crittenden wrote: Martin Kosek wrote: On 09/05/2012 08:06 PM, Rob Crittenden wrote: Rob Crittenden wrote: Martin Kosek wrote: On 07/05/2012 08:39 PM, Rob Crittenden wrote: Martin Kosek wrote: On 07/03/2012 04:41 PM, Rob Crittenden wrote: Deleting a replica can leave a replication vector (RUV) on the other servers. This can confuse things if the replica is re-added, and it also causes the server to calculate changes against a server that may no longer exist. 389-ds-base provides a new task that self-propogates itself to all available replicas to clean this RUV data. This patch will create this task at deletion time to hopefully clean things up. It isn't perfect. If any replica is down or unavailable at the time the cleanruv task fires, and then comes back up, the old RUV data may be re-propogated around. To make things easier in this case I've added two new commands to ipa-replica-manage. The first lists the replication ids of all the servers we have a RUV for. Using this you can call clean_ruv with the replication id of a server that no longer exists to try the cleanallruv step again. This is quite dangerous though. If you run cleanruv against a replica id that does exist it can cause a loss of data. I believe I've put in enough scary warnings about this. rob Good work there, this should make cleaning RUVs much easier than with the previous version. This is what I found during review: 1) list_ruv and clean_ruv command help in man is quite lost. I think it would help if we for example have all info for commands indented. This way user could simply over-look the new commands in the man page. 2) I would rename new commands to clean-ruv and list-ruv to make them consistent with the rest of the commands (re-initialize, force-sync). 3) It would be nice to be able to run clean_ruv command in an unattended way (for better testing), i.e. respect --force option as we already do for ipa-replica-manage del. This fix would aid test automation in the future. 
4) (minor) The new question (and the del too) does not react too well for CTRL+D: # ipa-replica-manage clean_ruv 3 --force Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful. Continue to clean? [no]: unexpected error: 5) Help for clean_ruv command without a required parameter is quite confusing as it reports that command is wrong and not the parameter: # ipa-replica-manage clean_ruv Usage: ipa-replica-manage [options] ipa-replica-manage: error: must provide a command [clean_ruv | force-sync | disconnect | connect | del | re-initialize | list | list_ruv] It seems you just forgot to specify the error message in the command definition 6) When the remote replica is down, the clean_ruv command fails with an unexpected error: [root@vm-086 ~]# ipa-replica-manage clean_ruv 5 Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful. Continue to clean? [no]: y unexpected error: {'desc': 'Operations error'} /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors: [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: failed to connect to replagreement connection (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica, cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config), error 105 [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: replica (cn=meTovm-055.idm.lab. bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree, cn=config) has not been cleaned. You will need to rerun the CLEANALLRUV task on this replica. 
[04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: Task failed (1) In this case I think we should inform user that the command failed, possibly because of disconnected replicas and that they could enable the replicas and try again. 7) (minor) "pass" is now redundant in replication.py: +except ldap.INSUFFICIENT_ACCESS: +# We can't make the server we're removing read-only but +# this isn't a show-stopper +root_logger.debug("No permission to switch replica to read-only, continuing anyway") +pass I think this addresses everything. rob Thanks, almost there! I just found one more issue which needs to be fixed before we push: # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force Directory Manager password: Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal Failed to get data from 'vm-055.idm.lab.bos.redhat.com': {'desc': "Can't contact LDAP server"} Forcing
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
Martin Kosek wrote: On 09/05/2012 08:06 PM, Rob Crittenden wrote: Rob Crittenden wrote: Martin Kosek wrote: On 07/05/2012 08:39 PM, Rob Crittenden wrote: Martin Kosek wrote: On 07/03/2012 04:41 PM, Rob Crittenden wrote: Deleting a replica can leave a replication vector (RUV) on the other servers. This can confuse things if the replica is re-added, and it also causes the server to calculate changes against a server that may no longer exist. 389-ds-base provides a new task that self-propogates itself to all available replicas to clean this RUV data. This patch will create this task at deletion time to hopefully clean things up. It isn't perfect. If any replica is down or unavailable at the time the cleanruv task fires, and then comes back up, the old RUV data may be re-propogated around. To make things easier in this case I've added two new commands to ipa-replica-manage. The first lists the replication ids of all the servers we have a RUV for. Using this you can call clean_ruv with the replication id of a server that no longer exists to try the cleanallruv step again. This is quite dangerous though. If you run cleanruv against a replica id that does exist it can cause a loss of data. I believe I've put in enough scary warnings about this. rob Good work there, this should make cleaning RUVs much easier than with the previous version. This is what I found during review: 1) list_ruv and clean_ruv command help in man is quite lost. I think it would help if we for example have all info for commands indented. This way user could simply over-look the new commands in the man page. 2) I would rename new commands to clean-ruv and list-ruv to make them consistent with the rest of the commands (re-initialize, force-sync). 3) It would be nice to be able to run clean_ruv command in an unattended way (for better testing), i.e. respect --force option as we already do for ipa-replica-manage del. This fix would aid test automation in the future. 
4) (minor) The new question (and the del too) does not react too well for CTRL+D: # ipa-replica-manage clean_ruv 3 --force Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful. Continue to clean? [no]: unexpected error: 5) Help for clean_ruv command without a required parameter is quite confusing as it reports that command is wrong and not the parameter: # ipa-replica-manage clean_ruv Usage: ipa-replica-manage [options] ipa-replica-manage: error: must provide a command [clean_ruv | force-sync | disconnect | connect | del | re-initialize | list | list_ruv] It seems you just forgot to specify the error message in the command definition 6) When the remote replica is down, the clean_ruv command fails with an unexpected error: [root@vm-086 ~]# ipa-replica-manage clean_ruv 5 Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful. Continue to clean? [no]: y unexpected error: {'desc': 'Operations error'} /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors: [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: failed to connect to replagreement connection (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica, cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config), error 105 [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: replica (cn=meTovm-055.idm.lab. bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree, cn=config) has not been cleaned. You will need to rerun the CLEANALLRUV task on this replica. 
[04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: Task failed (1) In this case I think we should inform user that the command failed, possibly because of disconnected replicas and that they could enable the replicas and try again. 7) (minor) "pass" is now redundant in replication.py: +except ldap.INSUFFICIENT_ACCESS: +# We can't make the server we're removing read-only but +# this isn't a show-stopper +root_logger.debug("No permission to switch replica to read-only, continuing anyway") +pass I think this addresses everything. rob Thanks, almost there! I just found one more issue which needs to be fixed before we push: # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force Directory Manager password: Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal Failed to get data from 'vm-055.idm.lab.bos.redhat.com': {'desc': "Can't contact LDAP server"} Forcing removal on 'vm-086.idm.la
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
On 09/05/2012 08:06 PM, Rob Crittenden wrote: > Rob Crittenden wrote: >> Martin Kosek wrote: >>> On 07/05/2012 08:39 PM, Rob Crittenden wrote: Martin Kosek wrote: > On 07/03/2012 04:41 PM, Rob Crittenden wrote: >> Deleting a replica can leave a replication vector (RUV) on the >> other servers. >> This can confuse things if the replica is re-added, and it also >> causes the >> server to calculate changes against a server that may no longer exist. >> >> 389-ds-base provides a new task that self-propogates itself to all >> available >> replicas to clean this RUV data. >> >> This patch will create this task at deletion time to hopefully >> clean things up. >> >> It isn't perfect. If any replica is down or unavailable at the time >> the >> cleanruv task fires, and then comes back up, the old RUV data may be >> re-propogated around. >> >> To make things easier in this case I've added two new commands to >> ipa-replica-manage. The first lists the replication ids of all the >> servers we >> have a RUV for. Using this you can call clean_ruv with the >> replication id of a >> server that no longer exists to try the cleanallruv step again. >> >> This is quite dangerous though. If you run cleanruv against a >> replica id that >> does exist it can cause a loss of data. I believe I've put in >> enough scary >> warnings about this. >> >> rob >> > > Good work there, this should make cleaning RUVs much easier than > with the > previous version. > > This is what I found during review: > > 1) list_ruv and clean_ruv command help in man is quite lost. I think > it would > help if we for example have all info for commands indented. This way > user could > simply over-look the new commands in the man page. > > > 2) I would rename new commands to clean-ruv and list-ruv to make them > consistent with the rest of the commands (re-initialize, force-sync). > > > 3) It would be nice to be able to run clean_ruv command in an > unattended way > (for better testing), i.e. 
respect --force option as we already do for > ipa-replica-manage del. This fix would aid test automation in the > future. > > > 4) (minor) The new question (and the del too) does not react too > well for > CTRL+D: > > # ipa-replica-manage clean_ruv 3 --force > Clean the Replication Update Vector for > vm-055.idm.lab.bos.redhat.com:389 > > Cleaning the wrong replica ID will cause that server to no > longer replicate so it may miss updates while the process > is running. It would need to be re-initialized to maintain > consistency. Be very careful. > Continue to clean? [no]: unexpected error: > > > 5) Help for clean_ruv command without a required parameter is quite > confusing > as it reports that command is wrong and not the parameter: > > # ipa-replica-manage clean_ruv > Usage: ipa-replica-manage [options] > > ipa-replica-manage: error: must provide a command [clean_ruv | > force-sync | > disconnect | connect | del | re-initialize | list | list_ruv] > > It seems you just forgot to specify the error message in the command > definition > > > 6) When the remote replica is down, the clean_ruv command fails with an > unexpected error: > > [root@vm-086 ~]# ipa-replica-manage clean_ruv 5 > Clean the Replication Update Vector for > vm-055.idm.lab.bos.redhat.com:389 > > Cleaning the wrong replica ID will cause that server to no > longer replicate so it may miss updates while the process > is running. It would need to be re-initialized to maintain > consistency. Be very careful. > Continue to clean? 
[no]: y > unexpected error: {'desc': 'Operations error'} > > > /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors: > [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - > cleanAllRUV_task: failed > to connect to replagreement connection > (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica, > > cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping > tree,cn=config), error 105 > [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - > cleanAllRUV_task: replica > (cn=meTovm-055.idm.lab. > bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping > > > > tree, cn=config) has not been cleaned. You will need to rerun the > CLEANALLRUV task on this replica. > [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - > cleanAllRUV_task: Task > failed (1) > > In this case I think we should inform user that the command failed, > possibly > because of disconnected replicas and that they could enable the > replicas and > try
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
Rob Crittenden wrote:
> Martin Kosek wrote:
> [review points 1-7 snipped]
>
> I think this addresses everything.
>
> rob

Thanks, almost there! I just found one more issue which needs to be fixed before we push:

# ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force
Directory Manager password:

Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal
Failed to get data from 'vm-055.idm.lab.bos.redhat.com': {'desc': "Can't contact LDAP server"}
Forcing removal on 'vm-086.idm.lab.bos.redhat.com'
There were issues removing a connection: %d format: a number is required
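The "%d format" failure above is the classic symptom of a %-style format string being applied to the wrong operand (here, an error dict instead of a number). A minimal standalone sketch, with an illustrative error dict, of the bug class and of what matching the format to the operand looks like:

```python
# Reproduce the "%d format: a number is required" class of error:
# a %-format string whose conversion does not match the operand type.
err = {'desc': "Can't contact LDAP server"}
try:
    msg = "There were issues removing a connection: %d" % err  # wrong: %d with a dict
except TypeError as exc:
    # The fix amounts to using a conversion that matches the operand.
    msg = "There were issues removing a connection: %s" % exc
print(msg)
```

The original `unexpected error` output is the TypeError text itself leaking to the user, which is why the message ends mid-sentence in the transcript above.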
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
On 07/05/2012 08:39 PM, Rob Crittenden wrote: > Martin Kosek wrote: >> On 07/03/2012 04:41 PM, Rob Crittenden wrote: >>> Deleting a replica can leave a replication vector (RUV) on the other >>> servers. >>> This can confuse things if the replica is re-added, and it also causes the >>> server to calculate changes against a server that may no longer exist. >>> >>> 389-ds-base provides a new task that self-propogates itself to all available >>> replicas to clean this RUV data. >>> >>> This patch will create this task at deletion time to hopefully clean things >>> up. >>> >>> It isn't perfect. If any replica is down or unavailable at the time the >>> cleanruv task fires, and then comes back up, the old RUV data may be >>> re-propogated around. >>> >>> To make things easier in this case I've added two new commands to >>> ipa-replica-manage. The first lists the replication ids of all the servers >>> we >>> have a RUV for. Using this you can call clean_ruv with the replication id >>> of a >>> server that no longer exists to try the cleanallruv step again. >>> >>> This is quite dangerous though. If you run cleanruv against a replica id >>> that >>> does exist it can cause a loss of data. I believe I've put in enough scary >>> warnings about this. >>> >>> rob >>> >> >> Good work there, this should make cleaning RUVs much easier than with the >> previous version. >> >> This is what I found during review: >> >> 1) list_ruv and clean_ruv command help in man is quite lost. I think it would >> help if we for example have all info for commands indented. This way user >> could >> simply over-look the new commands in the man page. >> >> >> 2) I would rename new commands to clean-ruv and list-ruv to make them >> consistent with the rest of the commands (re-initialize, force-sync). >> >> >> 3) It would be nice to be able to run clean_ruv command in an unattended way >> (for better testing), i.e. respect --force option as we already do for >> ipa-replica-manage del. 
This fix would aid test automation in the future. >> >> >> 4) (minor) The new question (and the del too) does not react too well for >> CTRL+D: >> >> # ipa-replica-manage clean_ruv 3 --force >> Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 >> >> Cleaning the wrong replica ID will cause that server to no >> longer replicate so it may miss updates while the process >> is running. It would need to be re-initialized to maintain >> consistency. Be very careful. >> Continue to clean? [no]: unexpected error: >> >> >> 5) Help for clean_ruv command without a required parameter is quite confusing >> as it reports that command is wrong and not the parameter: >> >> # ipa-replica-manage clean_ruv >> Usage: ipa-replica-manage [options] >> >> ipa-replica-manage: error: must provide a command [clean_ruv | force-sync | >> disconnect | connect | del | re-initialize | list | list_ruv] >> >> It seems you just forgot to specify the error message in the command >> definition >> >> >> 6) When the remote replica is down, the clean_ruv command fails with an >> unexpected error: >> >> [root@vm-086 ~]# ipa-replica-manage clean_ruv 5 >> Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389 >> >> Cleaning the wrong replica ID will cause that server to no >> longer replicate so it may miss updates while the process >> is running. It would need to be re-initialized to maintain >> consistency. Be very careful. >> Continue to clean? [no]: y >> unexpected error: {'desc': 'Operations error'} >> >> >> /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors: >> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: failed >> to connect to replagreement connection >> (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica, >> cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping >> tree,cn=config), error 105 >> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: >> replica >> (cn=meTovm-055.idm.lab. 
>> bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping >> >> tree, cn=config) has not been cleaned. You will need to rerun the >> CLEANALLRUV task on this replica. >> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: Task >> failed (1) >> >> In this case I think we should inform user that the command failed, possibly >> because of disconnected replicas and that they could enable the replicas and >> try again. >> >> >> 7) (minor) "pass" is now redundant in replication.py: >> +except ldap.INSUFFICIENT_ACCESS: >> +# We can't make the server we're removing read-only but >> +# this isn't a show-stopper >> +root_logger.debug("No permission to switch replica to read-only, >> continuing anyway") >> +pass >> > > I think this addresses everything. > > rob Thanks, almost there! I just found one more issue which needs to be fixed before we push: # ipa-replica-manage del vm-055.idm.lab.
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
Martin Kosek wrote:
> [review points 1-7 snipped]

I think this addresses everything.

rob

From a092bc35cebae12591600451ede25511818e8a85 Mon Sep 17 00:00:00 2001
From: Rob Crittenden
Date: Wed, 27 Jun 2012 14:51:45 -0400
Subject: [PATCH] Run the CLEANALLRUV task when deleting a replication agreement.

https://fedorahosted.org/freeipa/ticket/2303
---
 install/tools/ipa-replica-manage       |  105 +++-
 install/tools/man/ipa-replica-manage.1 |   17 +-
 ipaserver/install/replication.py       |   29 +
 3 files changed, 148 insertions(+), 3 deletions(-)

diff --git a/install/tools/ipa-replica
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
On 07/03/2012 04:41 PM, Rob Crittenden wrote:
> [patch 1031 announcement snipped]

Good work there, this should make cleaning RUVs much easier than with the previous version.

This is what I found during review:

1) The list_ruv and clean_ruv command help in the man page is quite lost. I think it would help if we, for example, had all the info for commands indented. As it is, the user can easily overlook the new commands in the man page.

2) I would rename the new commands to clean-ruv and list-ruv to make them consistent with the rest of the commands (re-initialize, force-sync).

3) It would be nice to be able to run the clean_ruv command in an unattended way (for better testing), i.e. respect the --force option as we already do for ipa-replica-manage del. This fix would aid test automation in the future.

4) (minor) The new question (and the one in del too) does not react too well to CTRL+D:

# ipa-replica-manage clean_ruv 3 --force
Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
Cleaning the wrong replica ID will cause that server to no
longer replicate so it may miss updates while the process
is running. It would need to be re-initialized to maintain
consistency. Be very careful.
Continue to clean? [no]: unexpected error:

5) Help for the clean_ruv command without its required parameter is quite confusing, as it reports that the command is wrong rather than the parameter:

# ipa-replica-manage clean_ruv
Usage: ipa-replica-manage [options]
ipa-replica-manage: error: must provide a command [clean_ruv | force-sync | disconnect | connect | del | re-initialize | list | list_ruv]

It seems you just forgot to specify the error message in the command definition.

6) When the remote replica is down, the clean_ruv command fails with an unexpected error:

[root@vm-086 ~]# ipa-replica-manage clean_ruv 5
Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
Cleaning the wrong replica ID will cause that server to no
longer replicate so it may miss updates while the process
is running. It would need to be re-initialized to maintain
consistency. Be very careful.
Continue to clean? [no]: y
unexpected error: {'desc': 'Operations error'}

/var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors:
[04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: failed to connect to replagreement connection (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config), error 105
[04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: replica (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config) has not been cleaned. You will need to rerun the CLEANALLRUV task on this replica.
[04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: Task failed (1)

In this case I think we should inform the user that the command failed, possibly because of disconnected replicas, and that they could enable the replicas and try again.

7) (minor) "pass" is now redundant in replication.py:

+        except ldap.INSUFFICIENT_ACCESS:
+            # We can't make the server we're removing read-only but
+            # this isn't a show-stopper
+            root_logger.debug("No permission to switch replica to read-only, continuing anyway")
+            pass

Martin

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel
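For context on what the patch is driving: the CLEANALLRUV operation is triggered by adding a task entry under cn=tasks,cn=config in 389-ds. The sketch below only builds the entry (no server needed); the suffix and replica ID are illustrative, not taken from the thread, and the commented-out `add_s` call shows where a live python-ldap connection would come in.

```python
# Sketch of the 389-ds CLEANALLRUV task entry that a cleanallruv() helper
# would add. Assumptions (illustrative): suffix dc=example,dc=com, replica ID 5.
replica_id = 5
suffix = "dc=example,dc=com"
dn = "cn=clean %d,cn=cleanallruv,cn=tasks,cn=config" % replica_id
entry = {
    "objectclass": ["top", "extensibleObject"],
    "replica-base-dn": [suffix],   # replicated suffix whose RUV is cleaned
    "replica-id": [str(replica_id)],  # the stale replica ID to remove
}
print(dn)
# Writing the entry requires a live connection, e.g. with python-ldap:
#   conn.add_s(dn, ldap.modlist.addModlist(entry))
```

Once added, the task self-propagates to every reachable replica, which is exactly why point 6 above surfaces when one replica is down.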
[Freeipa-devel] [PATCH] 1031 run cleanallruv task
Deleting a replica can leave a replication vector (RUV) on the other servers. This can confuse things if the replica is re-added, and it also causes the server to calculate changes against a server that may no longer exist.

389-ds-base provides a new task that self-propagates itself to all available replicas to clean this RUV data.

This patch will create this task at deletion time to hopefully clean things up.

It isn't perfect. If any replica is down or unavailable at the time the cleanruv task fires, and then comes back up, the old RUV data may be re-propagated around.

To make things easier in this case I've added two new commands to ipa-replica-manage. The first lists the replication ids of all the servers we have a RUV for. Using this you can call clean_ruv with the replication id of a server that no longer exists to try the cleanallruv step again.

This is quite dangerous though. If you run cleanruv against a replica id that does exist it can cause a loss of data. I believe I've put in enough scary warnings about this.

rob

From e5a5b19b64e1b81ce560cc5b1edd540b9920a928 Mon Sep 17 00:00:00 2001
From: Rob Crittenden
Date: Wed, 27 Jun 2012 14:51:45 -0400
Subject: [PATCH] Run the CLEANALLRUV task when deleting a replication agreement.
https://fedorahosted.org/freeipa/ticket/2303
---
 install/tools/ipa-replica-manage       |   85 +++-
 install/tools/man/ipa-replica-manage.1 |   17 +++
 ipaserver/install/replication.py       |   28 +++
 3 files changed, 129 insertions(+), 1 deletion(-)

diff --git a/install/tools/ipa-replica-manage b/install/tools/ipa-replica-manage
index e2378173821457ed05dae273d148266ef822..a72b04a2e1676f0a8008e3181025e53e241d522c 100755
--- a/install/tools/ipa-replica-manage
+++ b/install/tools/ipa-replica-manage
@@ -22,6 +22,7 @@
 import os
 import ldap, re, krbV
 import traceback
+from urllib2 import urlparse
 
 from ipapython import ipautil
 from ipaserver.install import replication, dsinstance, installutils
@@ -37,6 +38,7 @@
 CACERT = "/etc/ipa/ca.crt"
 
 # dict of command name and tuples of min/max num of args needed
 commands = {
     "list":(0, 1, "[master fqdn]", ""),
+    "list_ruv":(0, 0, "", ""),
     "connect":(1, 2, " [other master fqdn]",
         "must provide the name of the servers to connect"),
     "disconnect":(1, 2, " [other master fqdn]",
@@ -44,7 +46,8 @@ commands = {
     "del":(1, 1, "", "must provide hostname of master to delete"),
     "re-initialize":(0, 0, "", ""),
-    "force-sync":(0, 0, "", "")
+    "force-sync":(0, 0, "", ""),
+    "clean_ruv":(1, 1, "Replica ID of to clean", ""),
 }
 
 def parse_options():
@@ -229,6 +232,7 @@ def del_link(realm, replica1, replica2, dirman_passwd, force=False):
     if repl2 and type1 == replication.IPA_REPLICA:
         failed = False
         try:
+            repl2.make_readonly()
             repl2.delete_agreement(replica1)
             repl2.delete_referral(replica1)
         except ldap.LDAPError, e:
@@ -251,6 +255,7 @@
     repl1.delete_agreement(replica2)
     repl1.delete_referral(replica2)
+    repl1.cleanallruv(repl2._get_replica_id(repl2.conn, None))
 
     if type1 == replication.WINSYNC:
         try:
@@ -268,6 +273,80 @@
     print "Deleted replication agreement from '%s' to '%s'" % (replica1, replica2)
 
+def get_ruv(realm, host, dirman_passwd):
+    """
+    Return the RUV entries as a list of tuples: (hostname, rid)
+    """
+    try:
+        thisrepl = replication.ReplicationManager(realm, host, dirman_passwd)
+    except Exception, e:
+        print "Failed to connect to server %s: %s" % (host, str(e))
+        sys.exit(1)
+
+    search_filter = '(&(nsuniqueid=---)(objectclass=nstombstone))'
+    try:
+        entries = thisrepl.conn.search_s(api.env.basedn, ldap.SCOPE_ONELEVEL,
+            search_filter, ['nsds50ruv'])
+    except ldap.NO_SUCH_OBJECT:
+        print "No RUV records found."
+        sys.exit(0)
+
+    servers = []
+    for ruv in entries[0].getValues('nsds50ruv'):
+        if ruv.startswith('{replicageneration'):
+            continue
+        data = re.match('\{replica (\d+) (ldap://.*:\d+)\}\s+\w+\s+\w*', ruv)
+        if data:
+            rid = data.group(1)
+            (scheme, netloc, path, params, query, fragment) = urlparse.urlparse(data.group(2))
+            servers.append((netloc, rid))
+        else:
+            print "unable to decode: %s" % ruv
+
+    return servers
+
+def list_ruv(realm, host, dirman_passwd, verbose):
+    """
+    List the Replica Update Vectors on this host to get the available
+    replica IDs.
+    """
+    servers = get_ruv(realm, host, dirman_passwd)
+    for (netloc, rid) in servers:
+        print "%s: %s" % (netloc, rid)
+
+def clean_ruv(realm, ruv, options):
+    """
+
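The nsds50ruv parsing at the heart of get_ruv() above can be exercised standalone. The sketch below uses the same regex as the patch; the RUV values are illustrative samples shaped like 389-ds data, and Python 3's urllib.parse stands in for the patch's Python 2 urllib2 import.

```python
import re
from urllib.parse import urlparse  # Python 3 equivalent of the patch's urllib2 urlparse

# Illustrative nsds50ruv values in the shape get_ruv() expects.
ruvs = [
    "{replicageneration} 4fb96c8a000000600000",
    "{replica 5 ldap://vm-055.idm.lab.bos.redhat.com:389} 4fb96d04000500050000 500a2d6c000100050000",
]
servers = []
for ruv in ruvs:
    if ruv.startswith("{replicageneration"):
        continue  # skip the generation entry, as the patch does
    data = re.match(r"\{replica (\d+) (ldap://.*:\d+)\}\s+\w+\s+\w*", ruv)
    if data:
        rid = data.group(1)                      # the replica ID clean_ruv needs
        netloc = urlparse(data.group(2)).netloc  # host:port of the replica
        servers.append((netloc, rid))
print(servers)  # [('vm-055.idm.lab.bos.redhat.com:389', '5')]
```

This is the (hostname, rid) list that list_ruv prints and that feeds the replica ID argument of clean_ruv.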