Resending. Somehow the last email was cut off.

Nalini

From: "nalini.elk...@insidethestack.com" <nalini.elk...@insidethestack.com>
To: jouni korhonen <jouni.nos...@gmail.com>; General Area Review Team <gen-art@ietf.org>; "draft-ietf-ippm-6man-pdm-option....@ietf.org" <draft-ietf-ippm-6man-pdm-option....@ietf.org>
Cc: Michael Ackermann <mackerm...@bcbsm.com>; Robert Hamilton <rhamil...@cas.org>; MORTON ALFRED C (AL) <acmor...@att.com>
Sent: Tuesday, February 21, 2017 9:59 AM
Subject: Re: Gen-ART review of draft-ietf-ippm-6man-pdm-option: Server Time

Jouni,
Here is the thread for your second major comment:

Major comment #2 from you:

>2) The PDM option relation to actual "server" time is somewhat confusing and
>the 5-tuple does not allow me to detect the real relationship between the
>server/application action that caused the generation of the packet and the
>PDM within the packet. This is specifically an issue with
>transport/application protocols that multiplex/interleave multiple
>application streams into one transport. I have no idea of the actual
>individual application time since the packets get generated independent of
>the processing of a single thread. I would welcome some discussion around
>here. Section 1.4 last paragraph is going to this direction but is not
>sufficient IMHO.

Yes, you are, of course, correct that all traffic will flow between the matching ports at the two endpoints. The 5-tuples will match regardless of the application.

The thing is that we never intended that PDM would distinguish between applications using the same 5-tuple. That is, it is a feature, not a bug.

What PDM WILL tell you is whether the problem is in the network or the host. In our experience, which is primarily on networks for large data centers, there is a different group which is involved to troubleshoot the problem depending on the nature of the problem. That is, do I get the application developers on the line or the team that deals with the routers & infrastructure. One of the important functions of PDM is to allow you to do quick triage so that you can get the right SWAT team going.

PDM does not tell you if the problem is in the IP stack or the application or buffer allocation. PDM also does not tell you which of the network segments or middle boxes is at fault.

The reason for PDM is to get the right specialists in place who can then be dispatched to investigate their area. In our experience, valuable time is often lost at this first stage of triage.
Both the network group and the application group have quite a few specialized tools at their disposal to further investigate their own areas.

I am adding some of this verbiage to section 1.4. Please see below:

CURRENT
-----------

1.4 Rationale for defined solution

The current IPv6 specification does not provide timing nor a similar field in the IPv6 main header or in any extension header. So, we define the IPv6 Performance and Diagnostic Metrics destination option (PDM).

Advantages include:

1. Real measure of actual transactions.
2. Independence from transport layer protocols.
3. Ability to span organizational boundaries with consistent instrumentation
4. No time synchronization needed between session partners
5. Ability to handle all transport protocols (TCP, UDP, SCTP, etc) in a uniform way

The PDM provides the ability to determine quickly if the (latency) problem is in the network or in the server (application). More intermediate measurements may be needed if the host or network discrimination is not sufficient. At the client, TCP/IP stack time vs. application time may still need to be broken out by client software.

NEW
----

1.4 Rationale for defined solution

The current IPv6 specification does not provide timing nor a similar field in the IPv6 main header or in any extension header. So, we define the IPv6 Performance and Diagnostic Metrics destination option (PDM).

Advantages include:

1. Real measure of actual transactions.
2. Independence from transport layer protocols.
3. Ability to span organizational boundaries with consistent instrumentation
4. No time synchronization needed between session partners
5. Ability to handle all transport protocols (TCP, UDP, SCTP, etc) in a uniform way

The PDM provides the ability to determine quickly if the (latency) problem is in the network or in the server (application). That is, it is a fast way to do triage.

One of the important functions of PDM is to allow you to quickly dispatch the right set of diagnosticians. Within network or server latency, there may be many components. The job of the diagnostician is to rule each one out until the culprit is found.

How PDM fits into this diagnostic picture is that PDM will quickly tell you how to escalate. PDM will point to either the network area or the server area. Within the server latency, PDM does not tell you if the bottleneck is in the IP stack or the application or buffer allocation. Within the network latency, PDM does not tell you which of the network segments or middle boxes is at fault. What PDM will tell you is whether the problem is in the network or the server.

In our experience, there is often a different group which is involved to troubleshoot the problem depending on the nature of the problem. That is, the problem may be escalated to the application developers or the team that deals with the routers and infrastructure. Both the network group and the application group have quite a few specialized tools at their disposal to further investigate their own areas. What is missing is the first step, which PDM provides. In our experience, valuable time is often lost at this first stage of triage. PDM is expected to reduce this time substantially.

Thanks,

Nalini Elkins
Inside Products, Inc.
www.insidethestack.com
(831) 659-8360

________________________________
From: jouni korhonen <jouni.nos...@gmail.com>
To: General Area Review Team <gen-art@ietf.org>; draft-ietf-ippm-6man-pdm-option....@ietf.org
Sent: Friday, September 23, 2016 11:14 AM
Subject: Gen-ART review of

I am the assigned Gen-ART reviewer for this draft. The General Area Review Team (Gen-ART) reviews all IETF documents being processed by the IESG for the IETF Chair. Please wait for direction from your document shepherd or AD before posting a new version of the draft.

For more information, please see the FAQ at <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
Document: draft-ietf-ippm-6man-pdm-option-05
Reviewer: Jouni Korhonen
Review Date: 9/23/2016
IETF LC End Date: 2016-09-28
IESG Telechat date: (if known)

Summary: The draft needs some work.

Major issues: I have two technical issues here:

1) There is no mention of what is the time reference plane for internal time stamping. All other timing and synchronization related documents I am aware of (at least outside IETF) describe it very clearly where in the processing/packet handling the time stamp is to be taken. Now the document gives me no idea as an implementer where that should take place. At least it makes it hard to calculate the *network* RTT precisely.

2) The PDM option relation to actual "server" time is somewhat confusing and the 5-tuple does not allow me to detect the real relationship between the server/application action that caused the generation of the packet and the PDM within the packet. This is specifically an issue with transport/application protocols that multiplex/interleave multiple application streams into one transport. I have no idea of the actual individual application time since the packets get generated independent of the processing of a single thread. I would welcome some discussion around here. Section 1.4 last paragraph is going to this direction but is not sufficient IMHO.

Minor issues:

1) This is a larger editorial issue. The document is far too long with a lot of repetition considering it describes only one IPv6 destination option. It is a writing style issue and I am fully aware of that. I have proposals how to cut text in the editorial comments section.

2) Section 1.2 3rd paragraph talks about IoT and that speed matters there. I find this too generalized statement. There are many other things that matter in this application domain and speed might not be that important as being able to send/receive that one to two bytes of data in a given time window. I suggest removing this paragraph.
Nits/editorial comments:

1) Section 1.4 numbered list: add missing full stops.

2) Section 3.2: remove "The 5-tuple consists of the source and destination IP addresses, the source and destination ports, and the upper layer protocol (ex. TCP, ICMP, etc)." since this is unnecessary repetition.

3) Section 3.2: remove "Operating systems MUST NOT implement a single counter for all connections." Seems again like unnecessary repetition to previous sentence.

4) Section 3.2 again unnecessary repetition of IPv6 basics that can be read from RFC2460. Suggest strongly to remove:

   "This indicates the following processing requirements:

   00 - skip over this option and continue processing the header.

   RFC2460 [RFC2460] defines other values for the Option Type field. These MUST NOT be used in the PDM."

   and

   "The possible values are as follows:

   0 - Option Data does not change en-route
   1 - Option Data may change en-route

   The three high-order bits described above are to be treated as part of the Option Type, not independent of the Option Type. That is, a particular option is identified by a full 8-bit Option Type, not just the low-order 5 bits of an Option Type."

5) Section 3.3 same as in comment 4). Suggest strongly removing:

   "This follows the order defined in RFC2460 [RFC2460]

   IPv6 header
   Hop-by-Hop Options header
   Destination Options header <--------
   Routing header
   Fragment header
   Authentication header
   Encapsulating Security Payload header
   Destination Options header <------------
   upper-layer header"

6) Suggest removing entire Section 3.4 and moving the following text to Section 3.3:

   "PDM MUST be placed before the ESP header in order to work. If placed before the ESP header, the PDM header will flow in the clear over the network thus allowing gathering of performance and diagnostic data without sacrificing security."

7) Section 3.6 suggest removing the following text. I see no value it would add to what has already been said:

   "As with all other destination options extension headers, the PDM is for destination nodes only. As specified above, intermediate devices MUST neither set nor modify this field."

8) Section 3.6 suggest removing the following 5-tuple text as it has already been described earlier in Section 2:

   "The 5-tuple is:

   SADDR : IP address of the sender
   SPORT : Port for sender
   DADDR : IP address of the destination
   DPORT : Port for destination
   PROTC : Protocol for upper layer (ex. TCP, UDP, ICMP)"

9) Sections 4.2 and 4.3 suggest removing them entirely. I see what value these sections add. I acknowledge they are good to know information of timer hardware implementation difference but do not really add value on the on-wire encoding of the PDM option.

10) Section 4.4 suggest removing the entire section. Time Base was already described in detail enough in Section 3.2.

11) Section 4.5 time base for picoseconds is 11 not 00.

12) Section 4.5 suggest removing the following text, since it does not add any more clarity to what has already been said in my opinion. This is because all the examples follow nice nybble increment in scaling:

   "Sample binary values (high order 16 bits taken)

   1 psec    1             0001
   1 nsec    3E8           0011 1110 1000
   1 usec    F4240         1111 0100 0010 0100 0000
   1 msec    3B9ACA00      0011 1011 1001 1010 1100 1010 0000 0000
   1 sec     E8D4A51000    1110 1000 1101 0100 1010 0101 0001 0000 0000 0000"

12) Section 4.6 I do not understand why this section is here. I strongly suggest removing it. Sections 4.5 and 3.2 already describe how I would encode the delta time using scaling as a separate fields not embedded (option fields ScaleDTLR and ScaleDTLS). Did I misunderstand something here?

13) Section 5 suggest removing the following text because of it repeating what has already been said earlier:

   "Each packet, in addition to the PDM contains information on the sender and receiver. As discussed before, a 5-tuple consists of:

   SADDR : IP address of the sender
   SPORT : Port for sender
   DADDR : IP address of the destination
   DPORT : Port for destination
   PROTC : Protocol for upper layer (ex. TCP, UDP, ICMP)

   It should be understood that the packet identification information is in each packet. We will not repeat that in each of the following steps."

14) Section 5.3 suggest merging the following text into one example and do necessary rewording. There is no need to do the same calculation twice on almost adjacent lines:

   "Sending time : packet 2 - receive time : packet 1

   We will call the result of this calculation: Delta Time Last Received (DELTATLR)

   That is:

   Delta Time Last Received = (Sending time: packet 2 - receive time: packet 1)"

15) Expand RTT and PSN on their first use.

Phew.. after all this I found the document good reading and most likely a useful tool to be used.

Regards,
Jouni
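
For reference, the sample values quoted from Section 4.5 (the first nit numbered 12) and the Delta Time Last Received calculation quoted in nit 14 can be cross-checked with a short sketch. It assumes, as the quoted table suggests, that each unit is expressed as a count of picoseconds. The names UNITS_IN_PSEC, hex_and_nibbles and delta_time_last_received are illustrative only and do not come from the draft.

```python
# Illustrative sketch only. Assumption: each unit in the quoted table is a
# count of picoseconds. Names here are hypothetical, not taken from the draft.
UNITS_IN_PSEC = {
    "1 psec": 1,
    "1 nsec": 10**3,
    "1 usec": 10**6,
    "1 msec": 10**9,
    "1 sec": 10**12,
}


def hex_and_nibbles(value):
    """Return (hex digits, binary digits grouped in 4-bit nibbles) for a non-negative int."""
    hex_digits = format(value, "X")
    nibbles = " ".join(format(int(digit, 16), "04b") for digit in hex_digits)
    return hex_digits, nibbles


def delta_time_last_received(send_time_pkt2, recv_time_pkt1):
    """Sending time of packet 2 minus receive time of packet 1, as quoted in nit 14.

    Both arguments are assumed to be readings of the same local clock, in the same unit.
    """
    return send_time_pkt2 - recv_time_pkt1


if __name__ == "__main__":
    # Print one row per unit: label, hex value, nibble-grouped binary value.
    for label, psec in UNITS_IN_PSEC.items():
        hex_digits, nibbles = hex_and_nibbles(psec)
        print(f"{label:7} {hex_digits:11} {nibbles}")
```

Run as a script, this prints one row per unit for side-by-side comparison with the quoted table.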