Re: [alto] Chair review of path-vector-13 (Part 1 of 2)

kaigao Mon, 22 Feb 2021 07:32:28 -0800

Hi Vijay and the ALTO WG,




This is a follow-up on the comments that were not fully addressed in the
previous email. Please see below.




Thanks!




Best,

Kai



-----Original Messages-----
From:"Vijay Gurbani" <vijay.gurb...@gmail.com>
Sent Time:2021-02-08 23:35:13 (Monday)
To: draft-ietf-alto-path-vec...@ietf.org
Cc: "IETF ALTO" <alto@ietf.org>
Subject: Chair review of path-vector-13 (Part 1 of 2)


Chair review from beginning of document to the end of S6.6.
Part 1 of 2.

Major:
- S4.1, below Figure 2:  Note that we do not have "availbw" defined in ALTO as 
a current cost metric, so it is not a good idea to use it here without 
qualifying it further.  If used as is, it creates confusion.  My advice would 
be to either qualify the use of "availbw" as a hypothetical cost metric, or 
choose an actual cost metric from the performance-metric draft and restate the 
example.

- S4.1, "Case 1": I don't see how the "application will obtain 150 Mbps at 
most."  Consider that the bottleneck bandwidth is 100 Mbps, as that is the 
bandwidth of the most constrained link.  Once traffic leaves sw5, it can get no 
more than 100 Mbps on the remaining links.  So, I don't understand how the 
"application will obtain 150 Mbps at most."?  Perhaps I am missing something?

- S4.2.3: This paragraph, especially the second sentence onwards needs to be 
re-written to better flesh out the need.  Currently it says, "While both 
approaches...", however, it is not clear that there are two approaches being 
delineated from each other here.  It needs more edits so it reads better. (Some 
nits in this paragraph appear in the Nits section trying to tease out the 
language.)

- S5.1.3: When Section 5 begins, it says that "This section gives a 
non-normative overview of the Path Vector extension."  However, in S5.1.3, 
there is a normative "MUST".  (Same problem in S5.3, there are many "MUST"s 
there, and in Section 5.3.3 there are "RECOMMENDED" and "SHOULD NOT".)

Generally, I am a bit hesitant that certain subsections of Section 5 --- 
Section 5.3.2 in particular --- appear to contain normative behaviour, and this 
should be specified in a normative section, or do NOT start Section 5 by saying 
that this section gives a non-normative overview, and make this a normative 
section. I understand this is a major comment, so please think how you want to 
handle this carefully.

- S5.3.2: Not sure I follow the logic in the first paragraph.  As Fig. 4 
showed, there is one PV request, and if ALTO SSE extension is being used, 
presumably, it will contain the "client-id".  If the response contains a Path 
Vector resource, shouldn't that "client-id" simply apply to it?  I am sure I am 
missing something here as you have thought about this more than me; perhaps you 
could add a simple example to make the problem more explicit.

- S6.4: Why have mini Security Considerations paragraphs in the subsections 
of S6.4, but not in the subsections of S6.3 and S6.5?  I am not saying that you 
remove the mini Security Considerations paragraphs, but if there are security 
considerations worth pointing out in S6.4, I suspect that there are security 
considerations worth pointing out in S6.3 and S6.5?  (One such security 
consideration is listed below in S6.5.1.)

- S6.4.2: "The persistent entity ID property is the entity identifier of the 
persistent ANE which an ephemeral ANE presents (See Section 5.1.2 for 
details)." ==> I am not sure what this means? Why is an ephemeral ANE 
presenting a persistent entity identifier?  Is it important that you are 
defining an ephemeral ANE and associating it with persistent entities?  If so, 
then please make this clear as there is a lot of ambiguity in this section.

- S6.5.1: What is the effect if the ALTO server chooses to obfuscate the path 
vector, causing the client to experience sub-optimal routing?  The client does 
not know that the server has obfuscated the path vector, so it MUST interpret 
the path vector as given to it by the ALTO server.  This raises the question 
whether such obfuscation, because it is indistinguishable from a non-obfuscated 
response, creates an attack on the client?  (Would a mini Security 
Consideration paragraph be appropriate here?)  Clearly, since ALTO assumes that 
the server is trusted to some degree, the issue becomes (a) can the client, by 
repeated querying, figure out that it is being duped on occasion?  (b) what 
does it then do?





[PV] The effects are highly implementation-specific, and it is true that
  obfuscation may create an attack on the client by compromising the integrity
  of ALTO information. As we discuss in Section 11, there are some obfuscation
  methods that can preserve the integrity of the information.

  Regarding the last two issues, the answer to (a) is also
  implementation- and network-specific. If the obfuscation is idempotent, i.e.,
  it generates the same obfuscated result for the same request, a client will
  not be able to figure out that it is being duped. Even if a client sees two
  different results, the difference may still be the consequence of internal
  network changes. As for (b), we feel that it does not fall outside the scope
  of Sec 15.2 in RFC 7285.

  Instead of expanding the security discussion in Sec 6.5.1, the proposed change
  is to move the security consideration on the integrity to Sec 11 (security
  consideration), as reduction/obfuscation are usually introduced as mechanisms
  of protecting confidentiality.
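
  To make the point about (a) concrete, here is a minimal Python sketch of an
  idempotent obfuscation. It is an illustration only and is not taken from the
  draft: the function name obfuscate_path_vector, the SERVER_KEY secret, the
  request fields, and the ANE names are all hypothetical. The sketch only
  reorders ANEs, which is one of the transformations mentioned in the proposed
  text, and it derives the order from a keyed hash of the request, so the same
  request always yields the same response.

```python
import hashlib
import hmac
import json
from typing import List

# Assumption: a per-server secret; the name and value are hypothetical.
SERVER_KEY = b"hypothetical-server-secret"


def obfuscate_path_vector(request: dict, path_vector: List[str]) -> List[str]:
    """Reorder the ANEs of a path vector deterministically.

    The order depends only on the server key, the request contents, and the
    ANE names, so repeating the same request gives the same result.
    """
    # Canonical encoding so that equal requests hash identically.
    canonical = json.dumps(request, sort_keys=True).encode()

    def rank(ane: str) -> bytes:
        message = canonical + b"|" + ane.encode()
        return hmac.new(SERVER_KEY, message, hashlib.sha256).digest()

    return sorted(path_vector, key=rank)


# Hypothetical request and path vector.
request = {"srcs": ["ipv4:192.0.2.1"], "dsts": ["ipv4:198.51.100.7"]}
real_path = ["ane1", "ane2", "ane3", "ane4"]

first = obfuscate_path_vector(request, real_path)
second = obfuscate_path_vector(request, real_path)
```

  Because first and second are identical, a client that repeats the query
  learns nothing about whether the order was altered. A scheme based on time or
  randomness would not have this property.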




OLD:

   To mitigate this risk, the ALTO server should consider protection
   mechanisms to reduce information exposure or obfuscate the real
   information, in particular, in settings where the network and the
   application do not belong to the same trust domain.  But the
   implementation of Path Vector extension involving reduction or
   obfuscation should guarantee the requested properties are still
   accurate, for example, by using minimal feasible region compression
   algorithms [TON2019] or obfuscation protocols [SC2018][JSAC2019].




NEW:

   To mitigate this risk, the ALTO server should consider protection
   mechanisms to reduce information exposure or obfuscate the real
   information, in particular, in settings where the network and the
   application do not belong to the same trust domain.  For example, in
   the multi-flow bandwidth reservation use case as introduced in
   Section 4, only the available bandwidth of the shared bottleneck link
   is crucial, and the ALTO server may only preserve the critical
   bottlenecks and can change the order of links appearing in the Path
   Vector response.

   However, arbitrary reduction and obfuscation of information exposure
   may potentially introduce a risk to the integrity of the ALTO
   information, leading to infeasible or suboptimal decisions of ALTO
   clients.

   To mitigate this risk, if an ALTO client finds that the traffic
   distribution based on the Path Vector information is not feasible
   (e.g., causing constant congestion) or not better than a distribution
   which does not fully conform to the information (e.g., by randomly
   choosing the source/destination for certain flows), it can follow the
   protection strategies for potential undesirable guidance from
   authenticated ALTO information, specified in Section 15.2.2 of RFC
   7285 [RFC7285].  While repeatedly sending the same query can
   potentially detect the integrity problem for certain obfuscation
   methods (e.g., those based on time or randomness) under certain
   network conditions (e.g., where the routing and ANE properties are
   stable), an ALTO client must be aware that this behavior may be
   considered as a denial-of-service attack on the server and may lead
   to the rejection of further requests from the client.

   On the other hand, this risk can also be mitigated from the server
   side.  While the implementation of an ALTO server is beyond the scope
   of this document, implementations of ALTO servers involving reduction
   or obfuscation of the Path Vector information should consider
   reduction/obfuscation mechanisms that can preserve the integrity of
   ALTO information, for example, by using minimal feasible region
   compression algorithms [TON2019] or obfuscation protocols
   [SC2018][JSAC2019].




Minor:

- S1, paragraph 3: Why would "job completion time" be shared by bottleneck  
network links?  On first glance, job completion time is a function of the  
compute resources on the host not network links, but on further reflection,   
job completion time could also be a function of the network links on the host 
if the data needs to be marshalled to the job (process) in order for it to 
complete.  If so, then perhaps reword as:
 
 OLD:
 For example, job completion time, which is an important QoE metric for a 
large-scale data analytics application, is impacted by shared bottleneck links 
inside the carrier network.

 NEW:
 For example, job completion time, which is an important QoE metric  for a 
large-scale data analytics application, is impacted by shared  bottleneck links 
inside the carrier network as link capacity may  impact the rate of data 
input/output to the job.

- S5.1.1: "Thus they must follow the mechanisms specified in the 
[i-D.ietf-alto-unified-props-new]." ==> Here, it may help to point to a 
specific section of the I-D you want the implementer to follow the mechanisms 
of.  Do you mean the naming mechanism defined in the I-D?  The inheritance 
mechanism defined in the I-D?

- S5.1.2: How does the client know that an ANE in a response is ephemeral 
versus persistent?  You answer this question in Section 6.4.2, perhaps you can 
put a forward reference to Section 6.4.2 as I am sure other readers will have 
the same question.

- S6.2.4: "...their entity domain names MUST be ".ane"..." ==> MUST be .ane or 
MUST use the .ane prefix?  I can't tell.  Please specify this better through an 
example as well.  You do have an example in the last paragraph, but the writing 
of the example is ambiguous.  My understanding is: ".ane:NET1" is an ephemeral 
ANE, while "dc-props.ane:DC1" is a persistent ANE.  Is that correct?  If so, 
just explicitly mention this.
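
The reading of the two example identifiers in the S6.2.4 comment can be sketched in Python as follows. This is a hedged illustration, not the draft's definition: it assumes an entity identifier has the form "[resource-id].domain-type:name", that an empty resource ID marks an ephemeral ANE, and that splitting at the first ":" and the last "." is sufficient. The names AneEntity and parse_entity_id are hypothetical.

```python
from typing import NamedTuple


class AneEntity(NamedTuple):
    resource_id: str   # empty string when no resource ID precedes the "."
    domain_type: str   # e.g. "ane"
    entity_name: str   # domain-type-specific part, e.g. "NET1"

    @property
    def is_ephemeral(self) -> bool:
        # Assumption from the review comment: no resource ID means ephemeral.
        return self.resource_id == ""


def parse_entity_id(entity_id: str) -> AneEntity:
    """Split an entity ID into resource ID, domain type, and entity name."""
    domain_name, colon, name = entity_id.partition(":")
    if not colon or not name:
        raise ValueError(f"missing ':' or entity name in {entity_id!r}")
    resource_id, dot, domain_type = domain_name.rpartition(".")
    if not dot or not domain_type:
        raise ValueError(f"missing '.' or domain type in {entity_id!r}")
    return AneEntity(resource_id, domain_type, name)


ephemeral = parse_entity_id(".ane:NET1")
persistent = parse_entity_id("dc-props.ane:DC1")
```

Under these assumptions, ephemeral.is_ephemeral is True and persistent.resource_id is "dc-props", which matches the interpretation the comment asks the authors to state explicitly.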

Nits:

- S4.1: s/the scheduling.  However,/the scheduling, however,/

- S1, paragraph 3: s/applications, however, the/applications, the/

- S1, paragraph 5: s/in a huge volume/in an increase in volume/

- S1: s/The pressure on the/The requirements on the/

- S1: s/ALTO server convey/ALTO server to convey/

- S1: s/that each identifies/that identifies/
  or
      s/that each identifies/, each element of which identifies/

- S3: s/in a cost map or for a/in a cost map, or for a/

- S4.2.1: s/Gigabytes, Terabytes, and even Petabytes/gigabytes, terabytes, and 
even petabytes/
(Reason: there is no need to gratuitously capitalize these.)

- S4.2.1: s/related to the completion time of the slowest data 
transfer./related to the data transfer time over the slowest link./

- S4.2.1: s/the Path Vector extension/the extension defined in this document/
(This is repeated in S4.2.2 and perhaps elsewhere, please consider it as a 
request for global change.)

- S4.2.2: s/It is getting important/It is important/

- S4.2.3: s/may have to make/will need/

- S4.2.3: s/and potentially with/and potentially need/


- S5.2: s/, meaning the/, this means that the/


Thanks,



- vijay

_______________________________________________
alto mailing list
alto@ietf.org
https://www.ietf.org/mailman/listinfo/alto
