Re: [Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-09-17 Thread Martin Kosek
On 09/17/2012 04:06 PM, Martin Kosek wrote:
> On 09/14/2012 09:16 PM, Rob Crittenden wrote:
>> Martin Kosek wrote:
>>> On 09/10/2012 08:34 PM, Rob Crittenden wrote:
 Martin Kosek wrote:
> On Thu, 2012-09-06 at 17:22 -0400, Rob Crittenden wrote:
>> Martin Kosek wrote:
>>> On 08/31/2012 07:40 PM, Rob Crittenden wrote:
 Rob Crittenden wrote:
> It was possible use ipa-replica-manage connect/disconnect/del to end 
> up
> orphaning or or more IPA masters. This is an attempt to catch and
> prevent that case.
>
> I tested with this topology, trying to delete B.
>
> A <-> B <-> C
>
> I got here by creating B and C from A, connecting B to C then deleting
> the link from A to B, so it went from A -> B and A -> C to the above.
>
> What I do is look up the servers that the delete candidate host has
> connections to and see if we're the last link.
>
> I added an escape clause if there are only two masters.
>
> rob

 Oh, this relies on my cleanruv patch 1031.

 rob

>>>
>>> 1) When I run ipa-replica-manage del --force to an already uninstalled 
>>> host,
>>> the new code will prevent me the deletation because it cannot connect to
>>> it. It
>>> also crashes with UnboundLocalError:
>>>
>>> # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force
>>>
>>> Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing 
>>> removal
>>> Traceback (most recent call last):
>>>  File "/sbin/ipa-replica-manage", line 708, in 
>>>main()
>>>  File "/sbin/ipa-replica-manage", line 677, in main
>>>del_master(realm, args[1], options)
>>>  File "/sbin/ipa-replica-manage", line 476, in del_master
>>>sys.exit("Failed read master data from '%s': %s" % 
>>> (delrepl.hostname,
>>> str(e)))
>>> UnboundLocalError: local variable 'delrepl' referenced before assignment
>>
>> Fixed.
>>
>>>
>>>
>>> I also hit this error when removing a winsync replica.
>>
>> Fixed.
>>
>>>
>>>
>>> 2) As I wrote before, I think having --force option override the user
>>> inquiries
>>> would benefit test automation:
>>>
>>> +if not ipautil.user_input("Continue to delete?", False):
>>> +sys.exit("Aborted")
>>
>> Fixed.
>>
>>>
>>>
>>> 3) I don't think this code won't cover this topology:
>>>
>>> A - B - C - D - E
>>>
>>> It would allow you deleting a replica C even though it would separate 
>>> A-B
>>> and
>>> D-E. Though we may not want to cover this situation now, what you got is
>>> definitely helping.
>>
>> I think you may be right. I only tested with 4 servers. With this B and
>> D would both still have 2 agreements so wouldn't be covered by the last
>> link test.
>
> Everything looks good now, so ACK. We just need to push it along with
> CLEANALLRUV patch.
>
> Martin
>

 Not to look a gift ACK In the mouth but here is a revised patch. I've 
 added a
 cleanup routine to remove an orphaned master properly. If you had tried the
 mechanism I outlined in the man page it would have worked but was
 less-than-complete. This way is better, just don't try it on a live master.

 I also added a cleanruv abort command, in case you want to kill an existing
 task.

 rob

>>>
>>> 1) CLEANRUV stuff should be in your patch 1031 and not here (but I will 
>>> comment
>>> on the code in this mail since it is in this patch)
>>>
>>>
>>> 2) In new command defitinion:
>>>
>>> +"abort-clean-ruv":(1, 1, "Replica ID of to abort cleaning", ""),
>>>
>>> I miss error message in case REPLICA_ID is not passed, this way error 
>>> message
>>> when I omit the parameter is confusing:
>>>
>>> # ipa-replica-manage abort-clean-ruv
>>> Usage: ipa-replica-manage [options]
>>>
>>> ipa-replica-manage: error: must provide a command [force-sync | clean-ruv |
>>> disconnect | connect | list-ruv | del | re-initialize | list | 
>>> abort-clean-ruv]
>>>
>>> On another note, I am thinking about the new command(s). Since we now have
>>> abort-clean-ruv command, we may want to also implement list-clean-ruv 
>>> commands
>>> in the future to see all CLEANALLRUV commands in process... Or we may 
>>> enhance
>>> list-ruv to write a flag like [CLEANALLRUV in process] for RUV's for which
>>> CLEANALLRUV task is in process.
>>>
>>>
>>> 3) Will clean-ruv - abort-clean-ruv - clean-ruv sequence work? If the 
>>> aborted
>>> CLEANALLRUV task stays in DIT, we may not be able to enter new CLEANALLRUV 
>>> task
>>> because we always use the same DN...
>>>
>>> Btw. did abort CLEANALLRUV command worked for you? Mine seemed to be stuck 
>>> on
>>> replicas t

Re: [Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-09-17 Thread Martin Kosek
On 09/14/2012 09:16 PM, Rob Crittenden wrote:
> Martin Kosek wrote:
>> On 09/10/2012 08:34 PM, Rob Crittenden wrote:
>>> Martin Kosek wrote:
 On Thu, 2012-09-06 at 17:22 -0400, Rob Crittenden wrote:
> Martin Kosek wrote:
>> On 08/31/2012 07:40 PM, Rob Crittenden wrote:
>>> Rob Crittenden wrote:
 It was possible use ipa-replica-manage connect/disconnect/del to end up
 orphaning or or more IPA masters. This is an attempt to catch and
 prevent that case.

 I tested with this topology, trying to delete B.

 A <-> B <-> C

 I got here by creating B and C from A, connecting B to C then deleting
 the link from A to B, so it went from A -> B and A -> C to the above.

 What I do is look up the servers that the delete candidate host has
 connections to and see if we're the last link.

 I added an escape clause if there are only two masters.

 rob
>>>
>>> Oh, this relies on my cleanruv patch 1031.
>>>
>>> rob
>>>
>>
>> 1) When I run ipa-replica-manage del --force to an already uninstalled 
>> host,
>> the new code will prevent me the deletation because it cannot connect to
>> it. It
>> also crashes with UnboundLocalError:
>>
>> # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force
>>
>> Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing 
>> removal
>> Traceback (most recent call last):
>>  File "/sbin/ipa-replica-manage", line 708, in 
>>main()
>>  File "/sbin/ipa-replica-manage", line 677, in main
>>del_master(realm, args[1], options)
>>  File "/sbin/ipa-replica-manage", line 476, in del_master
>>sys.exit("Failed read master data from '%s': %s" % 
>> (delrepl.hostname,
>> str(e)))
>> UnboundLocalError: local variable 'delrepl' referenced before assignment
>
> Fixed.
>
>>
>>
>> I also hit this error when removing a winsync replica.
>
> Fixed.
>
>>
>>
>> 2) As I wrote before, I think having --force option override the user
>> inquiries
>> would benefit test automation:
>>
>> +if not ipautil.user_input("Continue to delete?", False):
>> +sys.exit("Aborted")
>
> Fixed.
>
>>
>>
>> 3) I don't think this code won't cover this topology:
>>
>> A - B - C - D - E
>>
>> It would allow you deleting a replica C even though it would separate A-B
>> and
>> D-E. Though we may not want to cover this situation now, what you got is
>> definitely helping.
>
> I think you may be right. I only tested with 4 servers. With this B and
> D would both still have 2 agreements so wouldn't be covered by the last
> link test.

 Everything looks good now, so ACK. We just need to push it along with
 CLEANALLRUV patch.

 Martin

>>>
>>> Not to look a gift ACK In the mouth but here is a revised patch. I've added 
>>> a
>>> cleanup routine to remove an orphaned master properly. If you had tried the
>>> mechanism I outlined in the man page it would have worked but was
>>> less-than-complete. This way is better, just don't try it on a live master.
>>>
>>> I also added a cleanruv abort command, in case you want to kill an existing
>>> task.
>>>
>>> rob
>>>
>>
>> 1) CLEANRUV stuff should be in your patch 1031 and not here (but I will 
>> comment
>> on the code in this mail since it is in this patch)
>>
>>
>> 2) In new command defitinion:
>>
>> +"abort-clean-ruv":(1, 1, "Replica ID of to abort cleaning", ""),
>>
>> I miss error message in case REPLICA_ID is not passed, this way error message
>> when I omit the parameter is confusing:
>>
>> # ipa-replica-manage abort-clean-ruv
>> Usage: ipa-replica-manage [options]
>>
>> ipa-replica-manage: error: must provide a command [force-sync | clean-ruv |
>> disconnect | connect | list-ruv | del | re-initialize | list | 
>> abort-clean-ruv]
>>
>> On another note, I am thinking about the new command(s). Since we now have
>> abort-clean-ruv command, we may want to also implement list-clean-ruv 
>> commands
>> in the future to see all CLEANALLRUV commands in process... Or we may enhance
>> list-ruv to write a flag like [CLEANALLRUV in process] for RUV's for which
>> CLEANALLRUV task is in process.
>>
>>
>> 3) Will clean-ruv - abort-clean-ruv - clean-ruv sequence work? If the aborted
>> CLEANALLRUV task stays in DIT, we may not be able to enter new CLEANALLRUV 
>> task
>> because we always use the same DN...
>>
>> Btw. did abort CLEANALLRUV command worked for you? Mine seemed to be stuck on
>> replicas that are down just like CLEANALLRUV command. I had then paralelly
>> running CLEANALLRUV and ABORT CLEANALLRUV running for the same RUV ID. Then, 
>> it
>> is unclear to me what this command is actually good for.
>>
>>
>> 4)

Re: [Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-09-14 Thread Rob Crittenden

Martin Kosek wrote:

On 09/10/2012 08:34 PM, Rob Crittenden wrote:

Martin Kosek wrote:

On Thu, 2012-09-06 at 17:22 -0400, Rob Crittenden wrote:

Martin Kosek wrote:

On 08/31/2012 07:40 PM, Rob Crittenden wrote:

Rob Crittenden wrote:

It was possible use ipa-replica-manage connect/disconnect/del to end up
orphaning or or more IPA masters. This is an attempt to catch and
prevent that case.

I tested with this topology, trying to delete B.

A <-> B <-> C

I got here by creating B and C from A, connecting B to C then deleting
the link from A to B, so it went from A -> B and A -> C to the above.

What I do is look up the servers that the delete candidate host has
connections to and see if we're the last link.

I added an escape clause if there are only two masters.

rob


Oh, this relies on my cleanruv patch 1031.

rob



1) When I run ipa-replica-manage del --force to an already uninstalled host,
the new code will prevent me the deletation because it cannot connect to
it. It
also crashes with UnboundLocalError:

# ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force

Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal
Traceback (most recent call last):
 File "/sbin/ipa-replica-manage", line 708, in 
   main()
 File "/sbin/ipa-replica-manage", line 677, in main
   del_master(realm, args[1], options)
 File "/sbin/ipa-replica-manage", line 476, in del_master
   sys.exit("Failed read master data from '%s': %s" % (delrepl.hostname,
str(e)))
UnboundLocalError: local variable 'delrepl' referenced before assignment


Fixed.




I also hit this error when removing a winsync replica.


Fixed.




2) As I wrote before, I think having --force option override the user
inquiries
would benefit test automation:

+if not ipautil.user_input("Continue to delete?", False):
+sys.exit("Aborted")


Fixed.




3) I don't think this code won't cover this topology:

A - B - C - D - E

It would allow you deleting a replica C even though it would separate A-B and
D-E. Though we may not want to cover this situation now, what you got is
definitely helping.


I think you may be right. I only tested with 4 servers. With this B and
D would both still have 2 agreements so wouldn't be covered by the last
link test.


Everything looks good now, so ACK. We just need to push it along with
CLEANALLRUV patch.

Martin



Not to look a gift ACK In the mouth but here is a revised patch. I've added a
cleanup routine to remove an orphaned master properly. If you had tried the
mechanism I outlined in the man page it would have worked but was
less-than-complete. This way is better, just don't try it on a live master.

I also added a cleanruv abort command, in case you want to kill an existing 
task.

rob



1) CLEANRUV stuff should be in your patch 1031 and not here (but I will comment
on the code in this mail since it is in this patch)


2) In new command defitinion:

+"abort-clean-ruv":(1, 1, "Replica ID of to abort cleaning", ""),

I miss error message in case REPLICA_ID is not passed, this way error message
when I omit the parameter is confusing:

# ipa-replica-manage abort-clean-ruv
Usage: ipa-replica-manage [options]

ipa-replica-manage: error: must provide a command [force-sync | clean-ruv |
disconnect | connect | list-ruv | del | re-initialize | list | abort-clean-ruv]

On another note, I am thinking about the new command(s). Since we now have
abort-clean-ruv command, we may want to also implement list-clean-ruv commands
in the future to see all CLEANALLRUV commands in process... Or we may enhance
list-ruv to write a flag like [CLEANALLRUV in process] for RUV's for which
CLEANALLRUV task is in process.


3) Will clean-ruv - abort-clean-ruv - clean-ruv sequence work? If the aborted
CLEANALLRUV task stays in DIT, we may not be able to enter new CLEANALLRUV task
because we always use the same DN...

Btw. did abort CLEANALLRUV command worked for you? Mine seemed to be stuck on
replicas that are down just like CLEANALLRUV command. I had then paralelly
running CLEANALLRUV and ABORT CLEANALLRUV running for the same RUV ID. Then, it
is unclear to me what this command is actually good for.


4) Man page now contains non-existent command:

--- a/install/tools/man/ipa-replica-manage.1
+++ b/install/tools/man/ipa-replica-manage.1
@@ -42,12 +42,18 @@ Manages the replication agreements of an IPA server.
  \fBforce\-sync\fR
  \- Immediately flush any data to be replicated from a server specified with
the \-\-from option
  .TP
+\fBcleanup\fR
+\- Remove leftover references to a deleted master.
+.TP


The cleanup procedure was implemented rather as an option to the del command
than a separate command.


5) My favorite - new cleanup procedure user prompt miss the --force option
useful for test automation

+if not ipautil.user_input("Continue to clean master?", False):
+sys.exit("Cleanup aborted")


Otherwise the patch looks good.

Martin



I pulle

Re: [Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-09-14 Thread Martin Kosek
On 09/10/2012 08:34 PM, Rob Crittenden wrote:
> Martin Kosek wrote:
>> On Thu, 2012-09-06 at 17:22 -0400, Rob Crittenden wrote:
>>> Martin Kosek wrote:
 On 08/31/2012 07:40 PM, Rob Crittenden wrote:
> Rob Crittenden wrote:
>> It was possible use ipa-replica-manage connect/disconnect/del to end up
>> orphaning or or more IPA masters. This is an attempt to catch and
>> prevent that case.
>>
>> I tested with this topology, trying to delete B.
>>
>> A <-> B <-> C
>>
>> I got here by creating B and C from A, connecting B to C then deleting
>> the link from A to B, so it went from A -> B and A -> C to the above.
>>
>> What I do is look up the servers that the delete candidate host has
>> connections to and see if we're the last link.
>>
>> I added an escape clause if there are only two masters.
>>
>> rob
>
> Oh, this relies on my cleanruv patch 1031.
>
> rob
>

 1) When I run ipa-replica-manage del --force to an already uninstalled 
 host,
 the new code will prevent me the deletation because it cannot connect to
 it. It
 also crashes with UnboundLocalError:

 # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force

 Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal
 Traceback (most recent call last):
 File "/sbin/ipa-replica-manage", line 708, in 
   main()
 File "/sbin/ipa-replica-manage", line 677, in main
   del_master(realm, args[1], options)
 File "/sbin/ipa-replica-manage", line 476, in del_master
   sys.exit("Failed read master data from '%s': %s" % (delrepl.hostname,
 str(e)))
 UnboundLocalError: local variable 'delrepl' referenced before assignment
>>>
>>> Fixed.
>>>


 I also hit this error when removing a winsync replica.
>>>
>>> Fixed.
>>>


 2) As I wrote before, I think having --force option override the user
 inquiries
 would benefit test automation:

 +if not ipautil.user_input("Continue to delete?", False):
 +sys.exit("Aborted")
>>>
>>> Fixed.
>>>


 3) I don't think this code won't cover this topology:

 A - B - C - D - E

 It would allow you deleting a replica C even though it would separate A-B 
 and
 D-E. Though we may not want to cover this situation now, what you got is
 definitely helping.
>>>
>>> I think you may be right. I only tested with 4 servers. With this B and
>>> D would both still have 2 agreements so wouldn't be covered by the last
>>> link test.
>>
>> Everything looks good now, so ACK. We just need to push it along with
>> CLEANALLRUV patch.
>>
>> Martin
>>
> 
> Not to look a gift ACK In the mouth but here is a revised patch. I've added a
> cleanup routine to remove an orphaned master properly. If you had tried the
> mechanism I outlined in the man page it would have worked but was
> less-than-complete. This way is better, just don't try it on a live master.
> 
> I also added a cleanruv abort command, in case you want to kill an existing 
> task.
> 
> rob
> 

1) CLEANRUV stuff should be in your patch 1031 and not here (but I will comment
on the code in this mail since it is in this patch)


2) In new command defitinion:

+"abort-clean-ruv":(1, 1, "Replica ID of to abort cleaning", ""),

I miss error message in case REPLICA_ID is not passed, this way error message
when I omit the parameter is confusing:

# ipa-replica-manage abort-clean-ruv
Usage: ipa-replica-manage [options]

ipa-replica-manage: error: must provide a command [force-sync | clean-ruv |
disconnect | connect | list-ruv | del | re-initialize | list | abort-clean-ruv]

On another note, I am thinking about the new command(s). Since we now have
abort-clean-ruv command, we may want to also implement list-clean-ruv commands
in the future to see all CLEANALLRUV commands in process... Or we may enhance
list-ruv to write a flag like [CLEANALLRUV in process] for RUV's for which
CLEANALLRUV task is in process.


3) Will clean-ruv - abort-clean-ruv - clean-ruv sequence work? If the aborted
CLEANALLRUV task stays in DIT, we may not be able to enter new CLEANALLRUV task
because we always use the same DN...

Btw. did abort CLEANALLRUV command worked for you? Mine seemed to be stuck on
replicas that are down just like CLEANALLRUV command. I had then paralelly
running CLEANALLRUV and ABORT CLEANALLRUV running for the same RUV ID. Then, it
is unclear to me what this command is actually good for.


4) Man page now contains non-existent command:

--- a/install/tools/man/ipa-replica-manage.1
+++ b/install/tools/man/ipa-replica-manage.1
@@ -42,12 +42,18 @@ Manages the replication agreements of an IPA server.
 \fBforce\-sync\fR
 \- Immediately flush any data to be replicated from a server specified with
the \-\-from option
 .TP
+\fBcleanup\fR
+\- Remove leftover references to a deleted master.
+.TP

Re: [Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-09-10 Thread Rob Crittenden

Martin Kosek wrote:

On Thu, 2012-09-06 at 17:22 -0400, Rob Crittenden wrote:

Martin Kosek wrote:

On 08/31/2012 07:40 PM, Rob Crittenden wrote:

Rob Crittenden wrote:

It was possible use ipa-replica-manage connect/disconnect/del to end up
orphaning or or more IPA masters. This is an attempt to catch and
prevent that case.

I tested with this topology, trying to delete B.

A <-> B <-> C

I got here by creating B and C from A, connecting B to C then deleting
the link from A to B, so it went from A -> B and A -> C to the above.

What I do is look up the servers that the delete candidate host has
connections to and see if we're the last link.

I added an escape clause if there are only two masters.

rob


Oh, this relies on my cleanruv patch 1031.

rob



1) When I run ipa-replica-manage del --force to an already uninstalled host,
the new code will prevent me the deletation because it cannot connect to it. It
also crashes with UnboundLocalError:

# ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force

Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal
Traceback (most recent call last):
File "/sbin/ipa-replica-manage", line 708, in 
  main()
File "/sbin/ipa-replica-manage", line 677, in main
  del_master(realm, args[1], options)
File "/sbin/ipa-replica-manage", line 476, in del_master
  sys.exit("Failed read master data from '%s': %s" % (delrepl.hostname, 
str(e)))
UnboundLocalError: local variable 'delrepl' referenced before assignment


Fixed.




I also hit this error when removing a winsync replica.


Fixed.




2) As I wrote before, I think having --force option override the user inquiries
would benefit test automation:

+if not ipautil.user_input("Continue to delete?", False):
+sys.exit("Aborted")


Fixed.




3) I don't think this code won't cover this topology:

A - B - C - D - E

It would allow you deleting a replica C even though it would separate A-B and
D-E. Though we may not want to cover this situation now, what you got is
definitely helping.


I think you may be right. I only tested with 4 servers. With this B and
D would both still have 2 agreements so wouldn't be covered by the last
link test.


Everything looks good now, so ACK. We just need to push it along with
CLEANALLRUV patch.

Martin



Not to look a gift ACK In the mouth but here is a revised patch. I've 
added a cleanup routine to remove an orphaned master properly. If you 
had tried the mechanism I outlined in the man page it would have worked 
but was less-than-complete. This way is better, just don't try it on a 
live master.


I also added a cleanruv abort command, in case you want to kill an 
existing task.


rob

>From a8718439ac1f51d857a03ff27776850324c43cda Mon Sep 17 00:00:00 2001
From: Rob Crittenden 
Date: Fri, 31 Aug 2012 11:56:58 -0400
Subject: [PATCH] When deleting a master, try to prevent orphaning other
 servers.

If you have a replication topology like A <-> B <-> C and you try
to delete server B that will leave A and C orphaned. It may also
prevent re-installation of a new master on B because the cn=masters
entry for it probably still exists on at least one of the other masters.

Check on each master that it connects to to ensure that it isn't the
last link, and fail if it is. If any of the masters are not up then
warn that this could be a bad thing but let the user continue if
they want.

Add a new option to the del command, --cleanup, which runs the
replica_cleanup() routine to completely clean up references to a master.

Add a command to run the abort-clean-ruv task.

https://fedorahosted.org/freeipa/ticket/2797
---
 install/tools/ipa-replica-manage   | 115 -
 install/tools/man/ipa-replica-manage.1 |  17 +
 ipaserver/install/replication.py   |  22 +++
 3 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/install/tools/ipa-replica-manage b/install/tools/ipa-replica-manage
index c6ef51b7215164c9538afae942e3d42285ca860b..60d8ecbc22bfde66042b9ff4ea1068ad79566ed1 100755
--- a/install/tools/ipa-replica-manage
+++ b/install/tools/ipa-replica-manage
@@ -49,6 +49,7 @@ commands = {
 "re-initialize":(0, 0, "", ""),
 "force-sync":(0, 0, "", ""),
 "clean-ruv":(1, 1, "Replica ID of to clean", ""),
+"abort-clean-ruv":(1, 1, "Replica ID of to abort cleaning", ""),
 }
 
 def convert_error(exc):
@@ -70,6 +71,8 @@ def parse_options():
   help="provide additional information")
 parser.add_option("-f", "--force", dest="force", action="store_true", default=False,
   help="ignore some types of errors")
+parser.add_option("-c", "--cleanup", dest="cleanup", action="store_true", default=False,
+  help="DANGER: clean up references to a ghost master")
 parser.add_option("--binddn", dest="binddn", default=None, type="dn",
   help="Bind DN to use with remote server")
 parser.add_option("--

Re: [Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-09-07 Thread Martin Kosek
On Thu, 2012-09-06 at 17:22 -0400, Rob Crittenden wrote:
> Martin Kosek wrote:
> > On 08/31/2012 07:40 PM, Rob Crittenden wrote:
> >> Rob Crittenden wrote:
> >>> It was possible use ipa-replica-manage connect/disconnect/del to end up
> >>> orphaning or or more IPA masters. This is an attempt to catch and
> >>> prevent that case.
> >>>
> >>> I tested with this topology, trying to delete B.
> >>>
> >>> A <-> B <-> C
> >>>
> >>> I got here by creating B and C from A, connecting B to C then deleting
> >>> the link from A to B, so it went from A -> B and A -> C to the above.
> >>>
> >>> What I do is look up the servers that the delete candidate host has
> >>> connections to and see if we're the last link.
> >>>
> >>> I added an escape clause if there are only two masters.
> >>>
> >>> rob
> >>
> >> Oh, this relies on my cleanruv patch 1031.
> >>
> >> rob
> >>
> >
> > 1) When I run ipa-replica-manage del --force to an already uninstalled host,
> > the new code will prevent me the deletation because it cannot connect to 
> > it. It
> > also crashes with UnboundLocalError:
> >
> > # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force
> >
> > Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal
> > Traceback (most recent call last):
> >File "/sbin/ipa-replica-manage", line 708, in 
> >  main()
> >File "/sbin/ipa-replica-manage", line 677, in main
> >  del_master(realm, args[1], options)
> >File "/sbin/ipa-replica-manage", line 476, in del_master
> >  sys.exit("Failed read master data from '%s': %s" % (delrepl.hostname, 
> > str(e)))
> > UnboundLocalError: local variable 'delrepl' referenced before assignment
> 
> Fixed.
> 
> >
> >
> > I also hit this error when removing a winsync replica.
> 
> Fixed.
> 
> >
> >
> > 2) As I wrote before, I think having --force option override the user 
> > inquiries
> > would benefit test automation:
> >
> > +if not ipautil.user_input("Continue to delete?", False):
> > +sys.exit("Aborted")
> 
> Fixed.
> 
> >
> >
> > 3) I don't think this code won't cover this topology:
> >
> > A - B - C - D - E
> >
> > It would allow you deleting a replica C even though it would separate A-B 
> > and
> > D-E. Though we may not want to cover this situation now, what you got is
> > definitely helping.
> 
> I think you may be right. I only tested with 4 servers. With this B and 
> D would both still have 2 agreements so wouldn't be covered by the last 
> link test.

Everything looks good now, so ACK. We just need to push it along with
CLEANALLRUV patch.

Martin

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-09-06 Thread Rob Crittenden

Martin Kosek wrote:

On 08/31/2012 07:40 PM, Rob Crittenden wrote:

Rob Crittenden wrote:

It was possible use ipa-replica-manage connect/disconnect/del to end up
orphaning or or more IPA masters. This is an attempt to catch and
prevent that case.

I tested with this topology, trying to delete B.

A <-> B <-> C

I got here by creating B and C from A, connecting B to C then deleting
the link from A to B, so it went from A -> B and A -> C to the above.

What I do is look up the servers that the delete candidate host has
connections to and see if we're the last link.

I added an escape clause if there are only two masters.

rob


Oh, this relies on my cleanruv patch 1031.

rob



1) When I run ipa-replica-manage del --force to an already uninstalled host,
the new code will prevent me the deletation because it cannot connect to it. It
also crashes with UnboundLocalError:

# ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force

Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal
Traceback (most recent call last):
   File "/sbin/ipa-replica-manage", line 708, in 
 main()
   File "/sbin/ipa-replica-manage", line 677, in main
 del_master(realm, args[1], options)
   File "/sbin/ipa-replica-manage", line 476, in del_master
 sys.exit("Failed read master data from '%s': %s" % (delrepl.hostname, 
str(e)))
UnboundLocalError: local variable 'delrepl' referenced before assignment


Fixed.




I also hit this error when removing a winsync replica.


Fixed.




2) As I wrote before, I think having --force option override the user inquiries
would benefit test automation:

+if not ipautil.user_input("Continue to delete?", False):
+sys.exit("Aborted")


Fixed.




3) I don't think this code won't cover this topology:

A - B - C - D - E

It would allow you deleting a replica C even though it would separate A-B and
D-E. Though we may not want to cover this situation now, what you got is
definitely helping.


I think you may be right. I only tested with 4 servers. With this B and 
D would both still have 2 agreements so wouldn't be covered by the last 
link test.


rob

>From 66217dd61b8271d6282eaad729c92e6bf961123a Mon Sep 17 00:00:00 2001
From: Rob Crittenden 
Date: Fri, 31 Aug 2012 11:56:58 -0400
Subject: [PATCH] When deleting a master, try to prevent orphaning other
 servers.

If you have a replication topology like A <-> B <-> C and you try
to delete server B that will leave A and C orphaned. It may also
prevent re-installation of a new master on B because the cn=masters
entry for it probably still exists on at least one of the other masters.

Check on each master that it connects to to ensure that it isn't the
last link, and fail if it is. If any of the masters are not up then
warn that this could be a bad thing but let the user continue if
they want.

Document how to remove a cn=masters entry in the man page.

https://fedorahosted.org/freeipa/ticket/2797
---
 install/tools/ipa-replica-manage   | 66 ++
 install/tools/man/ipa-replica-manage.1 | 12 +++
 2 files changed, 78 insertions(+)

diff --git a/install/tools/ipa-replica-manage b/install/tools/ipa-replica-manage
index c6ef51b7215164c9538afae942e3d42285ca860b..24a33bfb5ea51035eb12eaf7944a7e566640c2ff 100755
--- a/install/tools/ipa-replica-manage
+++ b/install/tools/ipa-replica-manage
@@ -398,9 +398,52 @@ def clean_ruv(realm, ruv, options):
 thisrepl.cleanallruv(ruv)
 print "Cleanup task created"
 
+def check_last_link(delrepl, realm, dirman_passwd, force):
+"""
+We don't want to orphan a server when deleting another one. If you have
+A topology that looks like this:
+
+ A B
+ | |
+ | |
+ | |
+ C D
+
+If we try to delete host D it will orphan host B.
+
+What we need to do is if the master being deleted has only a single
+agreement, connect to that master and make sure it has agreements with
+more than just this master.
+
+@delrepl: a ReplicationManager object of the master being deleted
+
+returns: hostname of orphaned server or None
+"""
+replica_names = delrepl.find_ipa_replication_agreements()
+
+orphaned = []
+# Connect to each remote server and see what agreements it has
+for replica in replica_names:
+try:
+repl = replication.ReplicationManager(realm, replica, dirman_passwd)
+except ldap.SERVER_DOWN, e:
+print "Unable to validate that '%s' will not be orphaned." % replica
+if not force and not ipautil.user_input("Continue to delete?", False):
+sys.exit("Aborted")
+continue
+names = repl.find_ipa_replication_agreements()
+if len(names) == 1 and names[0] == delrepl.hostname:
+orphaned.append(replica)
+
+if len(orphaned):
+return ', '.join(orphaned)
+else:
+return None
+
 def del_master(r

Re: [Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-09-06 Thread Martin Kosek
On 08/31/2012 07:40 PM, Rob Crittenden wrote:
> Rob Crittenden wrote:
>> It was possible use ipa-replica-manage connect/disconnect/del to end up
>> orphaning or or more IPA masters. This is an attempt to catch and
>> prevent that case.
>>
>> I tested with this topology, trying to delete B.
>>
>> A <-> B <-> C
>>
>> I got here by creating B and C from A, connecting B to C then deleting
>> the link from A to B, so it went from A -> B and A -> C to the above.
>>
>> What I do is look up the servers that the delete candidate host has
>> connections to and see if we're the last link.
>>
>> I added an escape clause if there are only two masters.
>>
>> rob
> 
> Oh, this relies on my cleanruv patch 1031.
> 
> rob
> 

1) When I run ipa-replica-manage del --force to an already uninstalled host,
the new code will prevent me the deletation because it cannot connect to it. It
also crashes with UnboundLocalError:

# ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force

Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal
Traceback (most recent call last):
  File "/sbin/ipa-replica-manage", line 708, in 
main()
  File "/sbin/ipa-replica-manage", line 677, in main
del_master(realm, args[1], options)
  File "/sbin/ipa-replica-manage", line 476, in del_master
sys.exit("Failed read master data from '%s': %s" % (delrepl.hostname, 
str(e)))
UnboundLocalError: local variable 'delrepl' referenced before assignment


I also hit this error when removing a winsync replica.


2) As I wrote before, I think having --force option override the user inquiries
would benefit test automation:

+if not ipautil.user_input("Continue to delete?", False):
+sys.exit("Aborted")


3) I don't think this code won't cover this topology:

A - B - C - D - E

It would allow you deleting a replica C even though it would separate A-B and
D-E. Though we may not want to cover this situation now, what you got is
definitely helping.

Martin

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-08-31 Thread Rob Crittenden

Rob Crittenden wrote:

It was possible use ipa-replica-manage connect/disconnect/del to end up
orphaning or or more IPA masters. This is an attempt to catch and
prevent that case.

I tested with this topology, trying to delete B.

A <-> B <-> C

I got here by creating B and C from A, connecting B to C then deleting
the link from A to B, so it went from A -> B and A -> C to the above.

What I do is look up the servers that the delete candidate host has
connections to and see if we're the last link.

I added an escape clause if there are only two masters.

rob


Oh, this relies on my cleanruv patch 1031.

rob

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


[Freeipa-devel] [PATCH] 1050 prevent replica orphans

2012-08-31 Thread Rob Crittenden
It was possible use ipa-replica-manage connect/disconnect/del to end up 
orphaning or or more IPA masters. This is an attempt to catch and 
prevent that case.


I tested with this topology, trying to delete B.

A <-> B <-> C

I got here by creating B and C from A, connecting B to C then deleting 
the link from A to B, so it went from A -> B and A -> C to the above.


What I do is look up the servers that the delete candidate host has 
connections to and see if we're the last link.


I added an escape clause if there are only two masters.

rob
>From 9260967f3ef9df090df86e9705596b7cb4d322b6 Mon Sep 17 00:00:00 2001
From: Rob Crittenden 
Date: Fri, 31 Aug 2012 11:56:58 -0400
Subject: [PATCH] When deleting a master, try to prevent orphaning other
 servers.

If you have a replication topology like A <-> B <-> C and you try
to delete server B that will leave A and C orphaned. It may also
prevent re-installation of a new master on B because the cn=masters
entry for it probably still exists on at least one of the other masters.

Check on each master that it connects to to ensure that it isn't the
last link, and fail if it is. If any of the masters are not up then
warn that this could be a bad thing but let the user continue if
they want.

Document how to remove a cn=masters entry in the man page.

https://fedorahosted.org/freeipa/ticket/2797
---
 install/tools/ipa-replica-manage   | 57 ++
 install/tools/man/ipa-replica-manage.1 | 12 +++
 2 files changed, 69 insertions(+)

diff --git a/install/tools/ipa-replica-manage b/install/tools/ipa-replica-manage
index 5023d9a8d761943515c8753237db0ed5c42ab813..21e2868042c9adfbce5ff9ca5ebf8690ab5a76a6 100755
--- a/install/tools/ipa-replica-manage
+++ b/install/tools/ipa-replica-manage
@@ -373,6 +373,48 @@ def clean_ruv(realm, ruv, options):
 thisrepl.cleanallruv(ruv)
 print "Cleanup task created"
 
+def check_last_link(delrepl, realm, dirman_passwd):
+"""
+We don't want to orphan a server when deleting another one. If you have
+A topology that looks like this:
+
+ A B
+ | |
+ | |
+ | |
+ C D
+
+If we try to delete host D it will orphan host B.
+
+What we need to do is if the master being deleted has only a single
+agreement, connect to that master and make sure it has agreements with
+more than just this master.
+
+@delrepl: a ReplicationManager object of the master being deleted
+
+returns: hostname of orphaned server or None
+"""
+replica_names = delrepl.find_ipa_replication_agreements()
+
+orphaned = []
+# Connect to each remote server and see what agreements it has
+for replica in replica_names:
+try:
+repl = replication.ReplicationManager(realm, replica, dirman_passwd)
+except ldap.SERVER_DOWN, e:
+print "Unable to validate that '%s' will not be orphaned." % replica
+if not ipautil.user_input("Continue to delete?", False):
+sys.exit("Aborted")
+continue
+names = repl.find_ipa_replication_agreements()
+if len(names) == 1 and names[0] == delrepl.hostname:
+orphaned.append(replica)
+
+if len(orphaned):
+return ', '.join(orphaned)
+else:
+return None
+
 def del_master(realm, hostname, options):
 
 force_del = False
@@ -426,6 +468,21 @@ def del_master(realm, hostname, options):
 if not ipautil.user_input("Continue to delete?", False):
 sys.exit("Deletion aborted")
 
+# Check for orphans
+masters_dn = DN(('cn', 'masters'), ('cn', 'ipa'), ('cn', 'etc'), ipautil.realm_to_suffix(realm))
+try:
+masters = delrepl.conn.getList(masters_dn, ldap.SCOPE_ONELEVEL)
+except:
+sys.exit("Failed read master data from '%s': %s" % (delrepl.hostname, str(e)))
+# This only applies if we have more than 2 IPA servers, otherwise there is
+# no chance of an orphan.
+if len(masters) > 2:
+orphaned_server = check_last_link(delrepl, realm, options.dirman_passwd)
+if orphaned_server is not None:
+print "Deleting this server will orphan '%s'. " % orphaned_server
+print "You will need to reconfigure your replication topology to delete this server."
+sys.exit(1)
+
 # 4. Remove each agreement
 for r in replica_names:
 try:
diff --git a/install/tools/man/ipa-replica-manage.1 b/install/tools/man/ipa-replica-manage.1
index 4a1c489f33591ff6ac98fe7f9a16ebb6a52ee28a..3eeadd8d6f5af61d9890994f7cadf3acfdc2f3e0 100644
--- a/install/tools/man/ipa-replica-manage.1
+++ b/install/tools/man/ipa-replica-manage.1
@@ -59,6 +59,18 @@ Each IPA master server has a unique replication ID. This ID is used by 389\-ds\-
 When a master is removed, all other masters need to remove its replication ID from the list of masters. Normally this occurs automatically when a master is deleted with ipa\