Re: [sidr] WGLC for draft-ietf-sidr-algorithm-agility-03

Eric Osterweil Mon, 07 Nov 2011 14:45:17 -0800

On Nov 7, 2011, at 2:46 PM, Stephen Kent wrote:

>> ...
>> 
>> I can appreciate that this document represents some long-standing thought 
>> and effort.  However, the fact that I believe there is a flaw does not seem 
>> to need the support of an alternate design, right?  I'm pointing out an 
>> operational misalignment in _this_ design.  I think to offer an alternative 
>> at the same time as we are discussing a shortcoming here would be an 
>> inappropriate conflation (i.e. I think that would confuse this issue with 
>> another).
> 
> The authors do not agree that the global coordination requirement is a flaw.
> 
>> So, more specifically: I think that trying to mandate global coordination at 
>> this scale is an operational non-starter.  Why can't the design be made to 
>> accommodate different choices of algorithms and different operational 
>> schedules?  I think this is actually a requirement: that operational 
>> entities be able to choose their own schedules and make their own 
>> configuration choices.
> 
> If there is not a schedule when old algs die and new ones MUST be supported, 
> then one at least doubles the size of the repository system, and imposes a 
> burden on all CAs and RPs to support old algs forever.


Sorry, but I think freedom from global coordination is more than an 
inconvenient, or unpalatable, concept.  It is something that (afaict) has been 
an inherent requirement in those Internet-scale systems that have survived to 
date.  I don't think there is any precedent of an operational system of this 
scale that requires this level of global coordination of its configuration, is 
there?  What Internet-scale system remotely as large as BGP has this 
kind of global coordination model?  What is the evidence that such an approach 
will work here?

I think this may be suggesting that more requirements analysis is needed before 
we can have total faith in the design work.  If this particular design 
implicitly requires something that seems to be a non-starter, then I would say 
that seriously undermines the design and more design work is likely required.  
I have a specific example below...

> 
>>> 2- Not exactly. The milestones, as well as the alg suite spec, will appear 
>>> in a revised version of draft-ietf-sidr-rpki-algs. Any operational problem 
>>> that requires a delay in any transition phase would be brought to the 
>>> attention of the IESG (if the SIDR WG is no longer active) requesting that 
>>> this RFC be re-issued, with new milestone values for the affected 
>>> phase(s).
>> 
>> I'm sorry, but I really think this is likely to have trouble in a real 
>> operational setting.  I don't think anyone would claim that the IETF's 
>> processes operate at the same pace as operations.  For instance, if there is 
>> an emergency at the last minute of this roll, can the working group be 
>> expected to mint a new RFC and disseminate in short order (say, days)?  
>> There is a very fundamental misalignment here: creating standards and 
>> managing operations are very loosely coupled.  I think this is a very 
>> inappropriate place to try to enforce operational schedules.
> 
> I think you overstate the problem. The intervals for each phase are not 
> expected to be short, and there are phases that accommodate both old and new 
> algs in a fashion that allows considerable CA and RP flexibility.
> 
> Nonetheless, I think Terry's suggestion has merit. I can imagine having the 
> milestone RFC be coordinated through the NRO and IANA, and published by the 
> IETF, to help ensure that there is appropriate ISP input to the milestone
> development.

Sorry, but this really misses the issue I was trying to describe.  Suppose you 
have a very well-planned / long-term schedule in place.  As you march towards 
one of these cutover dates, suppose there is any operational problem (hardware 
failure, software failure, network partitioning of part of the system, newly 
discovered problems with the plan, etc).  At the "last minute" at least one 
operational body cannot meet the deadline.  How do you even securely verify 
that an operator needs to stop (as opposed to an impostor trying to halt 
things)? Then what? A new RFC in the 11th hour?  How is this overstating the 
problem?  There are two very different processes at play here: standards and 
policies get hammered out in their own time, operations happen whenever a 
problem feels like rearing its head.  I really think that conflating the two is 
a non-starter because they are governed by fundamentally different things, and 
so are their schedules.  Honestly, if someone needs to call a screeching 
halt to a rollover, how should that happen just days or hours before an 
RFC's deadline?  I think this is complicated by the fact that even if the NRO 
or IANA or anyone else actually can mint an RFC in a matter of hours (is that 
possible?), how would you guarantee that anyone else trying to respect the 
original RFC date even knows of the change?  Since global coordination is a 
requirement, they all have to be in sync either way, right?  I suppose if some 
portion of the operators don't get the message and start the rollover anyway, 
then we have another problem, yes?

Eric

_______________________________________________
sidr mailing list
sidr@ietf.org
https://www.ietf.org/mailman/listinfo/sidr
