[Pce] High level PCEP Review

Adrian Farrel Tue, 14 Nov 2006 10:24:19 -0800

Hi,

As discussed, it is time to review PCEP from outside the designteam. JP asked (well, OK, he begged :-) that at this review I limitmyself to major protocol issues, and leave other documentationissues to a later review. I have mainly done that, but I have alsoincluded a few process-related comments that I think you should

resolve in the next revision.


Cheers,
Adrian
===
Frontpage authors list.

You need to cut this down to at most 5, but preferably no more than4 people.I guess it is time to work out who the main editors are, and makethe split.

===
Title

I don't think that naming the protocol "Version 1" is appropriate.It is almost a statement of intended failure!

It is still OK to have the protocol version field.
===
Spelling

If you could run the text through a spell-checker, it would reallyimprove the readability and hence the quality of review.

===
"PCEP protocol"
Could you fix that rather annoying phrase!
===

Section 2 needs to include a statement of intent (or otherwise) tosatisfy the requirements in RFC4657.

===
I think section 4 belongs after section 5
===

Any reason why the Open, Close, and Keepalive messages do notconform to the naming convention "PCXxxx" of all other messages?

===
Remove RFC2119 language from section 5
===
PCErr is missing from section 5.

Indeed, the use of PCNtfy as shown in figure 5 to expresscancellation of a request by a PCE would seem to be the correctusage of PCErr.I suggest a bit of a rationalisation (and perhaps the creation of anew message - PCCncl).Thus PCNtfy becomes a benign notification message, PCErr is used bythe PCE to report an error (but not a failure), PCCncl is used bythe PCC to cancel a PCReq.It is best if we can resist overloading the semantics of thedifferent messages.

===

Section 5.2.5 would benefit from a statement that it is yourintention that the TCP connection can be re-established and recoverthe context of a session without further exchange of Open messages.I think this is what you intend. I am not sure it is good, andwould prefer that the session was also re-established with theexchange of a list of PCReq I-Ds still outstanding OR that a failedTCP connection automatically cleared the session. Otherwise youcomplicate the protocol and the implementation with handlingduplicate request I-Ds and lost requests/responses.

===

In section 6 you imply that a malformed PCErr will cause thegeneration of a PCErr. I suggest you need to avoid this behavior toprevent duelling error messages!

===
Can you include a reference to the definition of BNF in section 6
===

In 6.1 you need to describe the default settings of the Flags andReserved fields

===

Do you really support 2^32 byte messages? If this is a consciousdecision, then maybe OK.

===

6.3 doesn't mention the use of keepalive to detect TCP connectionfailure. I think it should.

===

In 6.4 the End-Points object is described as mandatory, but the BNFhas it optional

===
In 6.7
 The PCErr message may be sent by a PCC or a PCE in response to a
 request or in an unsolicited manner.
"may"?
===
In the objects you define reserved fields as MUST be set to zero.
I think you need:
- SHOULD be set to zero
- MUST be ignored on receipt
===

In 7.1 it is unclear, but I think the P flag would only havemeaning on a request, and the I flag on a response.So actually, you only need one flag (or you have some clarificationtext to write)

===

You will also need to explain what it means when a PCRep has the Iflag set for an object that had the P flag set on the PCReq. Ascurrently phrased this is an encoding error that would require aPCErr, but that would clearly be a waste of time.

===
In fact, why would you ever send a PCErr in response to a PCRep?
===
At the end of 7.1 you need to explain padding rules
===

Why does the Open need a protocol version number? This is in themessage header so we all know what protocol is being used/supported.in fact, I think you need to re-write the protocol version supportprocessing and negotiation.

===
Dead timer.

Should it be "SHOULD be set to zero, and MUST be ignored if thekeepalive timer is zero"

===
The fields in the Open object are a mess!

Can you bring the flags up with the version to improve the byte-alignment?

===

Note that in the message header the version is four bits but in theOpen object it is three bits.

===
In the Open object
 SID (PCEP session-ID - 8 bits): specifies a 2 octet unsigned PCEP
 session number that identifies the current session.
Are you using bit compression?
===

Open object TLVs may be out of the scope of this doc (7.2) but theformat needs to be defined here.

===
 When present in an Open message, the OPEN object specifies the
 proposed PCEP session characteristics.  Upon receiving unacceptable
 PCEP session characteristics during the PCEP session initialization

phase, the receiving PCEP peer (PCE) MAY include a PCEP objectwithinthe PCErr message so as to propose alternative sessioncharacteristic

 values.
Isn't there an issue with duelling Open messages?

You seem to have an independent Open in each direction, each ofwhich is responded with a PCErr, leading to a new Open in eachdirection.Is it necessary to have this symmetry? After all, the TCPconnection is always initiated by the PCC, so can't we also havethe Open initiated by the PCC and responded with a PCErr or an Openby the PCE. Keepalives then follow.

===
7.3

Why is the P flag set in an RP object carried in a PCNtf or PCErr?No computation action is required.

===

Why is there a reserved piece of the RP object? Is there adifference between the unused flags bits and the reserved field?

===
7.3.1
 Note that it is not required for a PCE to
 support the priority field: in this case, it is RECOMMENDED to set
 the priority field to 0 by the PCC in the RP object.

Why would it matter what value is set if it is not supported by thePCE?Wouldn't it possibly be helpful if the PCE forwards the request toanother PCE?

===
7.3.1

As described here there is no difference between the R and the Fbit. Both are used to recompute a path allowing re-use of resourcesalready used by the path. Presumably in the case of the F bit, someinformation about the failure is also supplied through anexclusion, but this is not mentioned. The behavior at the PCE forthese two bits would seem to be identical so unless you can givebetter separation of the meanings, I suggest you merge the bits.

===
7.3.1
The description of the O-bit is very confusing.
I think the problem lies with the definition of "explicit path"
Loose means that it is acceptable to include loose hops

You have no term to distinguish the use of abstract nodes thatcontain more than one network node

Explicit just means an ERO
Strict means that only strict hops are allowed
===
7.3.1

You need to think about how the Request ID is handled when the PCCrestarts. Are you requiring the PCC to have persistent memory toavoid re-use of a Request ID? (That is how the text reads!)

===
7.3.2
The description of R-bit handling looks broken.

The previous setting of the O-bit is not important because the PCCis able to use RRO to collect information from the network. Thus,the question is actually about how detailed the RRO is. The pointyou are making remains (that it is hard to compute a proper make-before-break resource-sharing path if you don't know the previouspath. [Those of you looking at the inter-area/AS PCE problem spacewill recognise this as impossible to solve unless stateful PCEs areused or the PCE is provided with a way to issue an enquiry to theABR/ASBR.]

===
7.3.2
 If the PCC receives a PCRep message that contains a RP object
 referring to an unknown Request-ID-Number, the PCC MUST send a PCErr
 message with Error-Type="Unknown request reference".

This feels like at most a "MAY". Why is there a requirement torespond?

===
7.4

I don't think the description on the inclusion of "failed"constraints in the PCRep message is in agreement with the BNF insection 6.5. The BNF appears to imply that an ERO must be presentin order to include, for example, a Metric object.[By the way: I continue to believe that this business of reportingthe failed constraints is extremely dubious. Where there are twoabsolute constraints to be satisfied, it is often impossible tostate which of them led to the failure to compute a path.]

===
7.4

How does a PCC tell the difference between a PCRep that containsthe "failed" constraints echoed from the PCReq, and modifiedconstraints that could have been satisfied? From the text itappears that you intend the C-bit to make this distinction, but thedescription of the C-bt does not say this.

===
7.4
Wouldn't it be a good idea to put a reason code in the No-Path object?
===
7.5

Can you add a note to say where the end points of the LSP *are*encoded.That is, although this PCReq is computing a segment only, it isimportant (at least for policy) to know the LSP end points as well.

===
7.6

First paragraph is hard to parse, but I think it says that theBandwidth object must always be present in a PCReq! But the secondparagraph explains what it means if it is absent.

===
7.6
Can you give a references to the IEEE FP definition.
===
7.7
Please add a third metric: hop count

I think you need to allow the Metric object to be present multipletimes, and you have this in the text, but it is missing in the BNF.On the other hand, you need to describe how to interpret multipleMetric objects for the same Type setting

===
7.7

Why do you need the B-bit? The metric value is set to zero with thesame semantic.

===
7.7

The text appears to say that only one Metric object may be presentwith the B-flag cleared. Why?

===
7.8 and 7.9 (and everywhere)

A real nit, but when you write "ERO object" or "RRO object" do youmean EROO and RROO? :-)

===
7.8

You say that the ERO is bounded by 3209, 3473 and 3477 which isfine, but how will an MPLS-TE LSR handle an ERO with an explicitlabel in it? You also allow future extensions to subobjects.I think this means that the PCReq must include a way for therequester to limit the subobjects that will be returned.

===
7.9
RRO is also used for reporting the path of a failed LSP?
===
7.10
What are optional TLVs for?
Perhaps to carry the additional attributes as defined in RFC4420?
===
7.10

Why do you need an LSPA object without resource affinities? This ispointless. It only exists in 3209 for backwards compatibility.

===
7.10

What is the meaning of holding priority with respect to pathcomputation?

===
7.10

Why are the Flags and Reserved fields different sizes in the twoobjects?

===
7.10

When computing a path to satisfy a request with the L-bit set, howwill the PCE know which links are protected with FRR?

===
7.11
Good to see the IRO
Why is there no XRO?

Exclusion is a standard part of the MPLS-TE MIB (RFC3812 seemplsTunnelHopInclude). If the exclusion can be configured on theheadend LSR, I think we need to be able to compute the path.

Besides, this is a requirement in RC4657.
===
7.11

Why is it necessary to list the subobjects for the IRO but not forthe ERO or the RRO?

===
7.12.1
I urge caution in the use of "simultaneously" in this section.
===
7.12.1

At the end of this section, you suddenly introduce the term"bundle". I think this is supposed to refer to the presence ofmultiple requests in a single PCReq message. Note that the SVEC canapply across messages. It would, perhaps be better to explain thatrequests in the same message are not necessarily associated by anSVEC, and that requests in separate messages may be associated byan SVEC.

===
7.12

You are missing text to describe the use of the SVEC on a PCRep(allowed according to 6.5).It feels like you would only use SVEC in the case of an RP, but theBNF has them as independent.

Please clarify
===
7.12.2
You have MUST NOT for each of your diversity characteristics.

I think you also need "desired" so that you can get "as muchdiversity as possible".How will you handle the overlap between (for example) node and linkdiversity? Just needs some text to explain.

===
7.12

Is it possible to end up with a nasty mesh of synchronizedrequests? Are there any rules to resolve this?

===
10

The state machine is quite hard to parse for reviewing purposes. Itwould be good to at least name the arrows on the figure. Mypersonal preference is to draw the FSM as a grid because this showsup missing state/event combinations far more easily.


Is the OpenRetry counter just a substate of OpenWait?

The description of OpenRetry is at odds with the error used (1:5).The error says a second unacceptable Open has been received, butthe text says that OpenRetry is incremented if the Open is anythingexcept malformed. Can I suggest that the use of a separate stateand not a substate would help you resolve this type of issue.

LocalOK also looks like a substate. Ditto RemoteOK. Please don'tuse substates, they are evil! The flip flop between OpenWait andKeepWait caused by these two variables and the negotiation ofparameters is ghastly.

I think that what the state machine shows us is that we do not needindependent Open messages in each direction. Anyway, if we *do*need them, we need correct states to manage them.

===
10
 If a malformed PCEP message is received or the TCP connection fails,
 the systems sends a PCEP CLOSE message, the system releases the PCEP
 resources for that PCEP peer, closes the TCP connection and moves to
 the Idle State.

This seems to be in contradiction with text in the main body of thedraft and with the existence of error code 6.

===
App A

Is this appendix just present to help you track the requirementsyou still need to satisfy, or do you intend to leave somerequirements unanswered?

BTW, I don't think this is a full list. Route exclusions is notlisted.


What are you doing to ensure that the list is complete?
===
I think Kenji's email address may have changed
Same for Jean-Louis



_______________________________________________
Pce mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/pce

[Pce] High level PCEP Review

Reply via email to