kea-dev Digest, Vol 16, Issue 4

kea-dev-request Mon, 13 Jul 2015 10:12:07 -0700

Today's Topics:

   1. Re:  Decline support requirements in Kea 1.0 (Tomek Mrugalski)
   2. Re:  Decline support requirements in Kea 1.0 (Tomek Mrugalski)
   3. Re:  Decline support requirements in Kea 1.0 (Tomek Mrugalski)
   4.  Call for comments about the Decline design (Tomek Mrugalski)
   5. Re:  Call for comments about the Decline design (Marcin Siodelski)
   6. Re:  Call for comments about the Lease Expiration design
      (Tomek Mrugalski)

----------------------------------------------------------------------

Message: 1
Date: Mon, 13 Jul 2015 15:23:19 +0200
From: Tomek Mrugalski <[email protected]>
To: Thomas Markwalder <[email protected]>, [email protected]
Subject: Re: [kea-dev] Decline support requirements in Kea 1.0
Message-ID: <[email protected]>
Content-Type: text/plain; charset=windows-1252

OK, I finally got back around to the Decline work. See my comments below.

On 25.06.2015 18:37, Thomas Markwalder wrote:
> G3 - Stating that a declined address must be removed from the managed
> pool is a design/implementation detail.  You might reword this as "A declined
> address MUST not be assigned to a client"
Reworded as suggested.
> 
> C1 - I do not think this requirement is necessary.  G1 and G2 already
> require it be supported.  I'm not sure you gain anything by having this.
It is necessary. Let me give you an example. We do support rapid commit.
It is not enabled by default, however (and that's fine for rapid
commit). Decline should be supported out of the box, without any knobs
necessary.

> C3 - Remove the parenthetical suggestion of how-to.  This belongs in a
> design document.
Removed.

> H1 and H2 - The word "new" is unnecessary.
Removed.

> S1 - Change "being declined" to "currently in the declined state"
Changed.

> S2 - Clarify that this value resets upon server restart.  Otherwise it
> implies we must keep track across restarts forever.
Added a general note that statistics are considered a runtime property and,
as such, are reset after a server restart. On a related note, I'm sure
there will be people who'd like to keep their stats after restart. Until
they voice their needs, let's ignore this topic for now.

> S3 - As with S1, "being declined" to "currently in the declined state"
Updated.

Tomek

------------------------------

Message: 2
Date: Mon, 13 Jul 2015 15:26:31 +0200
From: Tomek Mrugalski <[email protected]>
To: Marcin Siodelski <[email protected]>, Thomas Markwalder
        <[email protected]>,        [email protected]
Subject: Re: [kea-dev] Decline support requirements in Kea 1.0
Message-ID: <[email protected]>
Content-Type: text/plain; charset=windows-1252

On 26.06.2015 08:10, Marcin Siodelski wrote:
> "Declined address" in the terminology - suggest replacing "unknown
> entity" with a "different entity", as "unknown" is ambiguous.
Terminology extended.

> "Declined state":  - suggest to reword to "a state in which an address
> is marked by the server as unavailable for the assignment"
Updated.

> G4 - rather than adding the explanation what it means to "recover an
> address", it would be better to add a new term in the terminology and
> use it here:
> 
> "Declined address recovery (or Address Recovery)" - a process by which
> the server marks an address in the declined state as available for
> assignment again, i.e. moves it out of the declined state.
Added.

> G6 - I think it is too much for the requirements to imply that the
> declined addresses must be kept in the database. It would be sufficient
> to say that "The information about currently declined addresses MUST not
> be lost after system restart or crash". Plus another requirement which
> would say: "There must be a mechanism by which the system administrator
> can inspect currently declined addresses".
As Thomas later suggested, I removed the reference to an explicit mechanism.

> G7 - I suggest updating this requirement to say that the log message
> emitted, when the address is marked declined, includes the duration of
> time for which the address will remain in the declined state.
Reworded. It now says: "There MUST be dedicated log entries for putting
and recovering an address from the declined state. They must include the
address, the client's details and the amount of time the address being
declined will remain in the declined state."

> C2 - I suggest rewording it slightly to use the terminology: "The amount
> of time an address remains in the declined state MUST be configurable"
> 
> C3 - Similarly to C2 - It MUST be possible to configure the server to
> keep an address in the declined state until sysadmin intervention.
Both updated.

Tomek

------------------------------

Message: 3
Date: Mon, 13 Jul 2015 15:31:35 +0200
From: Tomek Mrugalski <[email protected]>
To: Thomas Markwalder <[email protected]>, Marcin Siodelski
        <[email protected]>,       [email protected]
Subject: Re: [kea-dev] Decline support requirements in Kea 1.0
Message-ID: <[email protected]>
Content-Type: text/plain; charset=windows-1252

On 26.06.2015 12:34, Thomas Markwalder wrote:
>> G6 - I think it is too much for the requirements to imply that the
>> declined addresses must be kept in the database. It would be sufficient
>> to say that "The information about currently declined addresses MUST not
>> be lost after system restart or crash".
> 
> This one could lead us down a rabbit hole:
You don't like rabbits, do you? ;)

>>  Plus another requirement which
>> would say: "There must be a mechanism by which the system administrator
>> can inspect currently declined addresses".
> Unless it is sufficient to accept the requirement as met by us documenting
> how the information can be understood by examination of the persisted
> data.   At first blush it implies we will provide some sort of tool or
> interface
> for reporting or retrieving this information.   We also made no such
> requirement
> for any other persisted information.   I would suggest we simply drop it.
Dropped as suggested. Declined addresses are somewhat different, because
there's a high chance that they require manual sysadmin intervention - at
least an investigation why the duplicate address was discovered and
occasionally a witch hunt. While the server will recover from the
situation automatically, the issue will reappear if the underlying
problem (unknown device using an address without permission) is not solved.

We'll probably have this conversation in the design phase, but I agree
that dropping such a requirement would give us more design flexibility.

Tomek

------------------------------

Message: 4
Date: Mon, 13 Jul 2015 16:54:09 +0200
From: Tomek Mrugalski <[email protected]>
To: "[email protected]" <[email protected]>
Subject: [kea-dev] Call for comments about the Decline design
Message-ID: <[email protected]>
Content-Type: text/plain; charset=utf-8

Hi everyone,
As part of the Kea 1.0 preparation, I wrote a short document about our
intended design for Decline support. It is described here:

http://kea.isc.org/wiki/DeclineDesign

The major idea is to use special hardware address or DUID values to
indicate a declined address and keep it in the regular lease database.
With this approach, the amount of work is greatly reduced, there is
almost no performance degradation and this approach is proven
(implemented years ago in Dibbler) to work well.
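
A minimal C++ sketch of the sentinel idea described above, for
illustration only. The Lease struct, the DECLINED_HWADDR value, the
decline() and isDeclined() helpers and the probation_secs parameter are
all assumptions made for this example; the actual special values are
defined in the design document and are not reproduced here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sentinel value; the real design may pick a different one.
static const std::vector<uint8_t> DECLINED_HWADDR = {0x00};

// Simplified stand-in for a lease record; not Kea's real Lease4 class.
struct Lease {
    uint32_t addr;                // IPv4 address in host byte order
    std::vector<uint8_t> hwaddr;  // client hardware address
    uint32_t valid_lft;           // lifetime in seconds
};

// Marks the lease as declined by overwriting the client's hardware address
// with the sentinel. Setting the lifetime to a probation period is an
// assumption of this sketch, not something stated in the message.
void decline(Lease& lease, uint32_t probation_secs) {
    lease.hwaddr = DECLINED_HWADDR;
    lease.valid_lft = probation_secs;
}

// A lease counts as declined when it carries the sentinel address.
bool isDeclined(const Lease& lease) {
    return lease.hwaddr == DECLINED_HWADDR;
}
```

Because the declined lease is an ordinary row with an unusual owner, the
existing lease storage code would not need a schema change in this model.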

I'd like to hear your opinions on this proposal. We plan to conclude the
design discussions around end of July.

Tomek

------------------------------

Message: 5
Date: Mon, 13 Jul 2015 17:41:17 +0200
From: Marcin Siodelski <[email protected]>
To: Tomek Mrugalski <[email protected]>,   "[email protected]"
        <[email protected]>
Subject: Re: [kea-dev] Call for comments about the Decline design
Message-ID: <[email protected]>
Content-Type: text/plain; charset=windows-1252

On 13.07.2015 16:54, Tomek Mrugalski wrote:
> Hi everyone,
> As part of the Kea 1.0 preparation, I wrote a short document about our
> intended design for Decline support. It is described here:
> 
> http://kea.isc.org/wiki/DeclineDesign
> 
> The major idea is to use special hardware address or DUID values to
> indicate a declined address and keep it in the regular lease database.
> With this approach, the amount of work is greatly reduced, there is
> almost no performance degradation and this approach is proven
> (implemented years ago in Dibbler) to work well.
> 
> I'd like to hear your opinions on this proposal. We plan to conclude the
> design discussions around end of July.
> 
> Tomek
> 

I had a quick look at the document. I will review it more thoroughly,
but for now I want to focus on one general issue.

I would like to discuss the issue of using the special HW address or
DUID to mark the lease as declined. I think that there are some flaws
which should at least be documented. When the address gets declined for
a certain client and the lease is updated to modify the HW address or
DUID, you effectively lose the information about who declined this
address. Sure, you can extract it from the logs, assuming that your
logging level is set so that such a message is logged. But
you can't associate a declined lease with the client who actually
declined it by looking into the lease database.

Another thing is that it may be faster for SQL databases to
look up leases using a numeric value rather than a DUID.
So, if you want to query for all declined leases, you could collect all
expired leases by a "declined" flag rather than by the varbinary value.

I accept the argument that there will not be many declined leases
compared to undeclined ones, so it may be premature to optimize
queries, but one never knows whether such optimizations will be needed in
the future. Then querying for declined leases by DUID may cause some
performance degradation.

At this point, however, I am mostly concerned about the representation of
the declined leases in the database, which would make troubleshooting
harder than if you had an additional field whose value of 1 would
indicate declined and 0 undeclined, and didn't modify the client
identifier.

I think some discussion in this respect would be useful in the document.
If nothing else, the implementation should hide its details, and it
should be a trivial and localized change to migrate from the concept
of using the DUID for marking declined leases to an additional flag.
Though, if someone has already implemented a wrapper around the database
that uses DUID-based queries, the migration may cause them a heart attack.
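
A minimal C++ sketch of the alternative argued for above: a dedicated
"declined" field that leaves the client identifier untouched, hidden
behind accessors so the storage choice stays a localized detail. The
class and its member names are hypothetical and not taken from Kea.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for a lease record; not Kea's real lease classes.
class Lease {
public:
    Lease(uint32_t addr, const std::vector<uint8_t>& client_id)
        : addr_(addr), client_id_(client_id), declined_(false) {}

    // Callers only use these accessors, so switching between a sentinel
    // DUID and a dedicated flag would not affect them.
    void decline() { declined_ = true; }
    void recover() { declined_ = false; }
    bool isDeclined() const { return declined_; }

    uint32_t addr() const { return addr_; }

    // The identity of the client that declined the address is preserved,
    // which is the troubleshooting benefit described in the message.
    const std::vector<uint8_t>& clientId() const { return client_id_; }

private:
    uint32_t addr_;
    std::vector<uint8_t> client_id_;
    bool declined_;  // hypothetical extra field: 1 = declined, 0 = not
};
```

With this layout a query for declined leases can filter on the numeric
field, and the lease row still shows which client sent the Decline.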

Marcin

------------------------------

Message: 6
Date: Mon, 13 Jul 2015 19:11:05 +0200
From: Tomek Mrugalski <[email protected]>
To: [email protected]
Subject: Re: [kea-dev] Call for comments about the Lease Expiration
        design
Message-ID: <[email protected]>
Content-Type: text/plain; charset=windows-1252

On 06.07.2015 14:43, Marcin Siodelski wrote:
> I have put together the document
> http://kea.isc.org/wiki/LeaseExpirationDesign, which presents the design
> for the Lease Expiration in Kea 1.0.
I reviewed your proposal and let me be the first one to say: good work!
Expiration is a tricky topic and I'm sure there were temptations to do
things with threads, processes or other tricky things. We may get there
eventually, but 1.0 is not the right time to step into this muddy area. :)

Anyway, on to the details:

Lease Reclamation Routine
Can you consider renaming reclaimLeases4() to processExpiredLeases4()? It
may be obvious what reclaim means in the lease expiration context, but
when you look at it some time in the future, it won't be that easy to
guess what "reclaim" means. It could just as well be read as reclaiming
declined leases or reclaiming leases that, after reconfiguration, no longer
belong to a subnet. On the other hand, if the decline design is accepted
as proposed, it may also reclaim declined leases, so maybe that name is
not that bad after all?

One thing is not clear to me. In the initial implementation with a
configuration of 100 subnets, how many times would reclaimLeases4() be
called? Once or 100 times? While I can certainly see a benefit of having per
subnet tweaks in the future, I think the base approach should be to
reclaim leases uniformly.

On a related note, this brings us to the interesting problem of measuring
the timeout. If you plan to implement only global (not per-subnet)
expiration in 1.0, this discussion becomes academic and we can skip it
in the 1.0 timeframe. Anyway, if you have 100 subnets and the user has
specified that expiration can take no longer than 10 milliseconds, would
the code favor expiration of leases from subnet 1 over those from subnet
100? I think the answer is "no" (which is good) if there is one
call for all subnets. The answer would likely be "yes" if there were
separate calls for each subnet. That's another reason why per-subnet
tweaks may be fragile. The code would have to implement some sort of
round robin with memory (the last expiration processing stopped at
subnet X, so this one continues from X or X+1).
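
The "round robin with memory" mentioned above could look roughly like
the following C++ sketch. It is an illustration under assumptions, not
part of the design: RoundRobinReclaimer and its methods are hypothetical,
and a maximum lease count stands in for the time budget so the behavior
is deterministic.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: reclaims expired leases subnet by subnet under a
// shared budget and remembers where it stopped, so that subnet 100 is
// not permanently starved by subnet 1.
class RoundRobinReclaimer {
public:
    explicit RoundRobinReclaimer(std::size_t subnet_count)
        : pending_(subnet_count, 0U), next_(0) {}

    void addExpired(std::size_t subnet, unsigned count) {
        pending_.at(subnet) += count;
    }
    unsigned pending(std::size_t subnet) const { return pending_.at(subnet); }
    std::size_t nextSubnet() const { return next_; }

    // Reclaims at most max_leases leases and returns how many were
    // reclaimed. Processing starts at the subnet where the previous run
    // stopped.
    unsigned run(unsigned max_leases) {
        unsigned done = 0;
        std::size_t visited = 0;
        while (done < max_leases && visited < pending_.size()) {
            unsigned& left = pending_[next_];
            const unsigned take = std::min(left, max_leases - done);
            left -= take;
            done += take;
            if (left > 0) {
                break;  // budget exhausted mid-subnet; resume here next time
            }
            next_ = (next_ + 1) % pending_.size();
            ++visited;
        }
        return done;
    }

private:
    std::vector<unsigned> pending_;  // expired leases waiting per subnet
    std::size_t next_;               // subnet to start from on the next run
};
```

With three subnets holding five expired leases each and a budget of
seven, the first run finishes subnet 0 and stops inside subnet 1; the
second run picks up at subnet 1 rather than starting over.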

Comment on unexpected shutdown:
I think the problem with unexpected shutdown is really a non-issue if you
arrange the steps in the appropriate order. In particular, the actual lease
removal must be done after the hooks are called and the name change request
is sent. If the server experiences an unexpected shutdown, the lease will
stay in the database. After restart a possibly extra hook call and D2 name
removal request will be issued, but that's OK. The data will be
consistent. So I agree with your comment that this aspect (not really an
issue in my opinion) should be documented. Describing it in detail in
the Guide for Hooks Developers is fine, but a brief mention in the
User's Guide is also necessary. Even without hooks, you can get
duplicate DNS update attempts (with the latter failing) if you shut down
your server abruptly.
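
The ordering argument above can be sketched in C++ as follows. This is
an illustration only: ExpirationContext, reclaimLease() and the trace
strings are hypothetical stand-ins for the hook call, the D2 name change
request and the lease database, and the crash is simulated with a flag.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are Kea APIs.
struct ExpirationContext {
    std::vector<std::string> leases;  // addresses still in the database
    std::vector<std::string> trace;   // side effects, in order
    bool crash_before_removal = false;
};

// Returns true if the lease was fully reclaimed.
bool reclaimLease(ExpirationContext& ctx, const std::string& addr) {
    ctx.trace.push_back("hook:" + addr);        // 1. run the callouts
    ctx.trace.push_back("ncr-remove:" + addr);  // 2. send the name removal
    if (ctx.crash_before_removal) {
        return false;  // simulated unexpected shutdown; lease is still stored
    }
    // 3. Only now remove the lease, so a crash never loses a lease whose
    //    hook and DNS processing has not happened yet.
    ctx.leases.erase(std::remove(ctx.leases.begin(), ctx.leases.end(), addr),
                     ctx.leases.end());
    ctx.trace.push_back("delete:" + addr);
    return true;
}
```

If the first attempt "crashes", the lease survives and the second
attempt repeats the hook call and the removal request before deleting
it, which is the harmless duplication described in the message.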

Discussion regarding the skip flag in the lease{4,6}_expire hook. I think
the skip flag should work as you described. Whether leases will "pile up"
or not really depends on the client patterns. If you have a limited
number of clients (e.g. in a company that does not permit visitors), you
may want to skip expiration as a way to provide stable addresses.
There are better ways to do it (very long lease times for example), but
there may be extra reasons (monitoring when devices are up, thus short
lifetimes and a desire to keep addresses stable). Anyway, explaining this
in the hooks guide would be very useful. BTW keep in mind ticket #3499
about replacing the skip flag with an enum. 1.0 is the last moment when we
can make such a change.
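
A small C++ sketch of what an enum-based result could look like for the
"stable addresses" use case above. The NextStep enum, its enumerators and
leaseExpireCallout() are assumptions made for illustration; they are not
taken from ticket #3499 or from the hooks API.

```cpp
#include <set>
#include <string>

// Hypothetical replacement for the boolean skip flag.
enum class NextStep {
    Continue,  // server proceeds with reclaiming the lease
    Skip       // server leaves the lease in place
};

// Hypothetical lease{4,6}_expire callout: leases of known devices are
// kept, so those devices retain stable addresses.
NextStep leaseExpireCallout(const std::set<std::string>& known_devices,
                            const std::string& hwaddr) {
    return known_devices.count(hwaddr) > 0 ? NextStep::Skip
                                           : NextStep::Continue;
}
```

An enum leaves room for further outcomes later without changing the
callout signature, which a boolean cannot offer.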

Sending Name Change Requests from callouts
I have mixed feelings about whether that is really that useful and whether
it's really part of the expiration design. The same thing could be done for
lease assignment ("a callout implementor would want to send a name change
request on his own"). I'm OK with keeping it as part of the design, but the
tickets associated with this particular part should be considered
optional and have low priority.

Periodically executing tasks
From the tasks listed at the end of your design, I presume that you
favor approach 2. I like it more than approach 1, so given the choice
between 1 and 2, I would go with 2. There's another possible approach, though.
Let's call it approach 3. You may think of it as a simplified approach 2.

Approach 3
Everything happens in the main thread. The code is allowed to create new
timers wherever needed, but it must use the IOService created in the
Daemon class. Right now the main DHCPv{4,6} loop calls
LeaseMgrFactory::instance().getIOServiceExecInterval(); to determine the
timeout, but it would call TimerMgr::getTimeout() instead. (That would
also solve the problem described in #3696). That method would go over
all registered timers and find the shortest timeout. That's not a big
deal, as I don't think we'll ever have more than 3 or 4 timers (one for
LFC, second for expiration, maybe a third one for failover and maybe one
more custom for hooks). In this approach, TimerMgr is little more than a
container for timers with a couple of utility methods. No WatchSocket would
be necessary.
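
A rough C++ sketch of approach 3 as described above: TimerMgr as a plain
container whose getTimeout() returns the shortest registered interval.
Only the names TimerMgr and getTimeout() come from the message; the
registerTimer() method, the fallback parameter and the use of fixed
intervals instead of real timers bound to the IOService are
simplifications assumed for this example.

```cpp
#include <algorithm>
#include <chrono>
#include <map>
#include <string>

// Hypothetical, much simplified TimerMgr. A real one would hold timers
// attached to the IOService created in the Daemon class and report the
// time remaining until the next one fires.
class TimerMgr {
public:
    typedef std::chrono::milliseconds Millis;

    void registerTimer(const std::string& name, Millis interval) {
        timers_[name] = interval;
    }

    // Returns the shortest registered interval, or the fallback when no
    // timer is registered. With 3 or 4 timers a linear scan is enough.
    Millis getTimeout(Millis fallback) const {
        if (timers_.empty()) {
            return fallback;
        }
        Millis shortest = timers_.begin()->second;
        for (const auto& timer : timers_) {
            shortest = std::min(shortest, timer.second);
        }
        return shortest;
    }

private:
    std::map<std::string, Millis> timers_;
};
```

The main DHCPv{4,6} loop would then use the returned value as its wait
timeout, so the shortest timer bounds how long the loop can block.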

Regarding approaches 2 and 3
Do you want to register timers by string? It would be more efficient if
each timer had a unique integer value assigned upon registration. We had
a similar discussion about statistics access patterns. In that case the
decision was to stick with strings as it's more natural. Also, hook
users may ask how they can ensure that their timer name is unique. A
method that returns a timer id would ensure that. But I don't insist on
it, if you prefer to keep the access-by-name pattern.
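
For comparison, registration that hands out integer ids could be as
simple as the following C++ sketch. TimerRegistry and its methods are
hypothetical names used only to illustrate the uniqueness argument.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical registry: the id is the key, the name is only a label.
class TimerRegistry {
public:
    // Returns an id that is unique within this registry, so two hook
    // libraries that happen to choose the same name do not collide.
    int registerTimer(const std::string& name) {
        const int id = next_id_++;
        names_[id] = name;
        return id;
    }

    // Throws std::out_of_range for an unknown id.
    const std::string& name(int id) const { return names_.at(id); }

private:
    int next_id_ = 1;
    std::unordered_map<int, std::string> names_;
};
```

The trade-off is the one noted for statistics: ids are cheaper to look
up and cannot clash, while names are more natural to configure and log.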

Configuration structure
- the parameter unwarned-cycles doesn't make sense, at least not in the
form you propose. Here's an example. I have the thing configured to run
expiration every minute. I also have clients that come and go and every
minute there's one lease that expires. Each expiration fully processes
all expired leases, yet after 5 minutes I start getting warnings that
something is wrong in my setup. Instead, I would log a message at INFO
level saying that the expiration was not completed because the allowed time
budget was reached and there are still X leases awaiting reclamation.

- I think idle-time should be specified in seconds, not milliseconds.
Even in the most aggressive setups ("I need to know about the expiration
exactly the second when it happens"), the smallest value used would be
1000. The name is also confusing, as it suggests that the server will do
expiration only if nothing else is happening. People may be afraid that if
it's set to 1 second and there's a packet coming in every second, the
expiration will never be triggered. A better name would be
"run-every-x-seconds", "expiration-interval" or simply "interval". The
documentation should clarify that it won't happen strictly every X
seconds, but every X+(time it took to process the last expiration cycle).
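
The two configuration comments above can be illustrated with a short C++
sketch: a cycle that stops when its time budget is used up and reports
how many leases remain, plus the "X + processing time" arithmetic for
the next cycle. All names here (CycleResult, runCycle, cycleLogMessage,
nextCycleStart) and the message wording are hypothetical, and the clock
is passed in so the example does not depend on real time.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <sstream>
#include <string>

struct CycleResult {
    std::size_t reclaimed;  // leases processed in this cycle
    std::size_t remaining;  // expired leases left for the next cycle
};

// Processes expired leases until the queue is empty or the budget is
// spent. now_ms is an injected clock returning milliseconds.
CycleResult runCycle(std::deque<int>& expired, long budget_ms,
                     const std::function<long()>& now_ms) {
    const long start = now_ms();
    CycleResult result = {0, 0};
    while (!expired.empty() && (now_ms() - start) < budget_ms) {
        expired.pop_front();  // stand-in for reclaiming one lease
        ++result.reclaimed;
    }
    result.remaining = expired.size();
    return result;
}

// Text for an INFO-level log entry; empty when the cycle completed.
std::string cycleLogMessage(const CycleResult& result) {
    if (result.remaining == 0) {
        return "";
    }
    std::ostringstream os;
    os << "expiration not completed: time budget reached, "
       << result.remaining << " leases awaiting reclamation";
    return os.str();
}

// The timer is re-armed after a cycle ends, so cycles start every
// interval plus the time the previous cycle took.
std::chrono::seconds nextCycleStart(std::chrono::seconds cycle_start,
                                    std::chrono::seconds processing_time,
                                    std::chrono::seconds interval) {
    return cycle_start + processing_time + interval;
}
```

A cycle that fully drains the queue produces no message at all, however
many cycles in a row find something to expire, which avoids the false
warnings described for unwarned-cycles.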

The design does not explain whether it will be possible to run the expiration
procedure during start-up. There should be a switch for it and the
default should be to do full expiration of all leases at startup.
That would be useful when recovering after server downtime.

New server command
You may consider renaming the command to "leases-expire" for the reasons
stated above.

Implementation tasks
I like this list. I need to update decline design with a similar one.

Extra steps for recovering declined addresses
If the Decline design I proposed gets through, there will be one extra
minor step to do. See point 4 in Implementation details here:
http://kea.isc.org/wiki/DeclineDesign.

Tomek

------------------------------

_______________________________________________
kea-dev mailing list
[email protected]
https://lists.isc.org/mailman/listinfo/kea-dev

End of kea-dev Digest, Vol 16, Issue 4
**************************************
