[Pce] Re: PCE monitoring doc

dimitri papadimitriou Mon, 20 Aug 2007 00:07:19 -0700

hi,

JP Vasseur wrote:

Hi,
On Aug 16, 2007, at 4:20 AM, dimitri papadimitriou wrote:
hi j-p
as far as i remember, this doc. is still open for discussion from San Diego and Prague mtg discussion.
An ID is always open for discussion as long as it is not an RFC.


just want to be sure this doc. is NOT a done deal, as it has
been presented for a couple of meetings.

here below for the record

<http://www3.ietf.org/proceedings/06nov/minutes/pce.txt>
v03 has been reworked but does not provide answer to the concerns expressed so far - quoting the doc.
> In PCE-based environments, it is critical to monitor the state of the
critical for what ? if computation time is an issue why delegate it (isn't that the safest assumption ?)
Looking like you are quibbling on words here ... Do you want to replace "critical" by "useful", is that your point ?


this does not answer the real question i think (the real
point is why is it "critical" or "useful" to receive this
info from the PCE while a local timer on the PCC can very
simply determine the delay experienced between initiation
of the request and the reception of the response)
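
A minimal sketch of the local-timer idea mentioned above, assuming a hypothetical blocking `send_request` callable that stands in for the PCC request/reply exchange (neither name comes from the draft or from PCEP). It only measures the end-to-end delay seen by the requester:

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed_request(send_request: Callable[[], T]) -> Tuple[T, float]:
    """Run a blocking request and return (response, elapsed_seconds)."""
    # Monotonic clock: not affected by wall-clock adjustments.
    start = time.monotonic()
    response = send_request()
    elapsed = time.monotonic() - start
    return response, elapsed


def fake_pce_request() -> str:
    # Hypothetical stand-in for a path computation request/reply.
    time.sleep(0.05)
    return "path-reply"


response, delay = timed_request(fake_pce_request)
```

The same wrapper could in principle be used by each PCE toward its peering PCE, under the same assumption that the reply is actually received.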

> path computation chain for troubleshooting and performance monitoring purposes:
troubleshooting of what ?
I think that I gave this explanation a few times already... Suppose that the head-end experiences long response times. I think that we can agree on the fact that this is potentially an issue, in which case the user may want to troubleshoot by issuing such a request in order to determine the location (which PCE of the chain) and the root cause.


does not help. if there are real issues an answer may never
be received and the information received may be erroneous
anyway

note also that it would be rather surprising that PCE and
PCC can not make use of local timers to determine such
kind of delays (as i said the PCC against the PCE and
each PCE against the peering PCE)

if there is congestion/troubles how would you ensure the information received back is accurate ?
The information of the waiting times, CPU utilization, ... is provided by the PCE.
The issue of accurateness is no different than for any other information provided for example by the MIB.


if you are looking for MIB information then i suspect that it
is also available at the SNMP management level ... why
a specific protocol mechanism to correlate it ?

i miss why LSRs shall now start performing troubleshooting
of PCEs ... i thought that PCEs were used to decrease
load on LSRs not the contrary (a PCE can only exceptionally
be blocking / defective, otherwise it results in more
trouble than actual performance improvement)

> liveness of each element (PCE) involved in the PCE chain,
if i well remember the PCE is a client-server model (fundamental assumption about the PCE approach) hence, why the client needs to know the "chain" of PCE servers ?
Strange question ... Try to operate a network and you'll immediately figure out.


i don't think you have captured the question. the point
is not whether the server shall be accountable but why
the PCC shall be made aware about the "chain"

the notion of the PCE chain is confusing from this point
of view

Back to the previous case, consider the case of an inter-domain TE deployment where the number of PCE chains may potentially be large. Isn't it useful during a troubleshooting event for example to know the set of PCEs involved in a PCE chain ? (or to check a particular PCE chain).


well i am not sure to follow the operational process you
are referring to, if PCCs need to check/monitor and
validate each computed segment per PCE then it is also
better to maintain autonomous segment computation and
prevent the overhead of additional cross-verification

this triggers to me the following point, before devising
methods for troubleshooting and monitoring computational
results it may be appropriate to have a much better view
of the usage of PCE before progressing on "tools"

the same is ongoing at CCAMP where in-depth work has
been started in understanding the needs based on the
operational practice

> detection of potential resource contention states
at t[0] contention, at t[1] message for perf. mon -> are running conditions identical ? since probabilistically, not what is the expectation behind this mechanism ?
would it be possible to have a "curve" of the deterioration of the performance as with an increasing number of computed paths the number of "monitoring messages" will also increase ?
Sure but ... this is true for all OAM tools.


nope. it has never been expected that the PCE gets dimensioned
to the total number of computed LSP over time (except for the
stateful PCEs) meaning that a stateless PCE was only expected
to support a certain number of requests per unit of time (and
this independently of the accumulated number of computed paths)

If you use a ping to locate a congestion spot at the time you get your reply the network state may have changed 10 times ... but if you use it because of a sustained congestion state then the ping may help you to locate the problem.


echo request/reply is not equivalent to resource consumption
measurement / monitoring on certain nodes and number of cycles
required for path computation; not sure i really capture the
analogy

The exact same reasoning applies here. Note that you can also retrieve historical data computed over a period of time, this is a matter of local configuration on the PCE, in which case you could retrieve the averaged (using for example a low pass filter) computation time.
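
As an illustration of what "averaged using a low pass filter" could mean here, a minimal sketch using a first-order exponential moving average. The function name, the `alpha` smoothing factor, and the sample values are assumptions for illustration only; the draft does not specify a filter.

```python
from typing import Iterable, Optional


def low_pass_average(samples: Iterable[float], alpha: float = 0.2) -> Optional[float]:
    """First-order low-pass filter (exponential moving average).

    alpha in (0, 1]: higher values follow recent samples more closely.
    Returns None when no sample has been observed.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    avg: Optional[float] = None
    for x in samples:
        # Seed with the first sample, then blend each new one in.
        avg = x if avg is None else alpha * x + (1.0 - alpha) * avg
    return avg


# Hypothetical computation times in milliseconds.
smoothed = low_pass_average([10.0, 20.0], alpha=0.5)
```

A single outlier moves the smoothed value by only a fraction `alpha` of its deviation, which is why such a value reflects a sustained state rather than one slow computation.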


knowing the computational interval that could result from the
use of different PC algorithms and the various conditions (both
in terms of input and local resources) i am sure that off-line
statistics would be more suitable than real-time collection per
PCC

> statistics in term of path computation times are examples of such metrics of interest.
interest to who and for which purpose (detection is fine but what is the issue to be solved) ?
Again sorry ... but strange question ... Interest to the user of course.


it is not a strange question, it is the initial question: which
problem are you trying to solve ? then one could try to assess
which method is appropriate

Before fixing a problem it is usually useful to locate the root cause of the problem. This tool may help for that purpose.


hence your problem is the computational limitation of a PCE:
what could a PCC do about it ? is sending back a time value
of any use ? (the delta between send/receive allows
the PCC to obtain the same kind of information assuming that
the PCE remains "reachable" i.e. no communication channel issue)

like any system, PCE requires suitable planning and dimensioning wrt perf. objectives; i have the impression that these fundamental design steps are skipped.
let's start discussion with this.
side note: the document states "In this document we call a "state metric" a metric that characterizes a PCE state" -> need to define the latter.
Sure.


indeed, this question is still open since end of 2004.

-d.

JP.
thanks,
-d.


JP Vasseur wrote:
Hi,
Just let you know about an IPR disclosure that has been filed with the IETF in relation to http://tools.ietf.org/id/draft-vasseur-pce-monitoring-03.txt. You can see the disclosure at https://datatracker.ietf.org/ipr/872/ and see the terms offered by the IPR claimant.
Thanks,
JP.
_______________________________________________
Pce mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/pce