If CAN does instance selection and DSCP marking, then it can influence
routing and select appropriate instances. It is an understandable,
deployable, and probably scalable solution with the selection and
marking deployed at an appropriate place. If that place is the PE, and
we want to use a different IP address, then it probably uses a tunnel to
deliver the packets. If that place is the end host, then it can do what
it wants.
However, if you expect the routing system to be including and respecting
information about compute end point capabilities, and want the routing
system to manage suitable server stickiness and all the other needed
properties, then I think the version of CAN you are describing is a bad
idea that will harm the infrastructure by mixing functionality in
inappropriate places.
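
As a rough illustration of the "selection and marking at the end host" case described above, here is a minimal sketch. The instance table, its load values, and the choice of AF41 as the code point are all assumptions made for the example; nothing in this thread specifies them.

```python
import socket

# Hypothetical instance table: address -> reported compute load (0.0 to 1.0).
INSTANCES = {
    "192.0.2.10": 0.80,
    "192.0.2.20": 0.35,
    "198.51.100.5": 0.60,
}

# AF41 (decimal 34) is used purely as an example code point.
DSCP_AF41 = 34


def select_instance(instances):
    """Return the address of the least-loaded instance."""
    return min(instances, key=instances.get)


def dscp_to_tos(dscp):
    """DSCP occupies the upper six bits of the former IPv4 TOS byte."""
    return dscp << 2


def open_marked_socket(dscp):
    """Create a UDP socket whose outgoing IPv4 packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

The host picks the destination itself and only asks the network for a treatment class; the routing system never sees the compute information. Whether the marking is honoured along the path depends on operator policy.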
Yours,
Joel
PS: I retain rtgwg on the copy for now, but as far as I can tell this
discussion belongs exclusively on the dyncast list.
On 6/17/2022 4:13 AM, [email protected] wrote:
Hi Dirk,
For mode 1, CAN is only aware of computing information, because the
basic routing could select the 'best' path naturally.
For mode 2, CAN could also know more about the network path when the
computing node selection is done, for instance, SR policy, network
slicing, DetNet, etc., and then utilize them. I don't think it will
influence the underlay routing; some apps could require a
specific routing policy/strategy even where there is no CAN service.
CAN aims to provide the joint optimization service to specific
applications. The difference is whether to select the 'best'
resource all the time, or just select the 'appropriate' one based on
more awareness and decision making.
Regards,
Peng
------------------------------------------------------------------------
[email protected]
*From:* Dirk Trossen <mailto:[email protected]>
*Date:* 2022-06-17 15:01
*To:* Linda Dunbar <mailto:[email protected]>;
[email protected]; dyncast <mailto:[email protected]>
*CC:* rtgwg <mailto:[email protected]>; David R. Oran
<mailto:[email protected]>; jefftant.ietf
<mailto:[email protected]>
*Subject:* RE: [Dyncast] CAN BoF issues #7 #17 #32
Hi Linda, Peng, all,
Let us tease apart what “include the path selection” may mean,
since the nature of this inclusion may differ significantly.
For this, let us assume a service instance S_1 as one of possibly
several ones for service S. S_1 may be reachable over a number of
network paths, the selection of some of which would significantly
impact any compute-aware selection of S_1 over the other available
service instances for S. I can see two modes of ‘including path
selection’:
1. S_1 exposes two (or more) IP addresses, where each IP address
reflects a path from the client to the exposed address. IP
addresses may be exposed across more than one network operator,
multi-homing the service instance. Now here, ‘path selection’ is
indirectly done by picking one IP address over all others,
including the IP addresses of other service instances, and indeed,
such indirect path selection may well be done through a metric
that measures against (at least one) crucial path-related metric.
But ultimately, the CAN provider still selects one of possibly many IP
addresses, right? More importantly, it remains the task of the
underlay routing infrastructure (again, which could include more
than one network operator) to determine what it deems as the
‘best’ path to each of the IP addresses (including the multi-homed
S_1 addresses).
2. Let’s stick with one IP address for S_1 now, although there are
still at least two possible paths to it, where the selection of
one over any of the other possible ones could well impact the
compute-aware suitability of S_1 over any of the other service
instances. The problem here is that ‘including the path selection’
would mean influencing the routing to the single S_1 IP address such
that the routing decision takes compute-awareness into
account. The path selection here is not indirect but direct,
together with the IP address (i.e., service instance endpoint)
selection. What is required here is that CAN provider and underlay
somehow work together in selecting one path over another (to the
same IP address), which in turn would mean to impact the overall
routing decision for S_1’s IP address, which in turn would mean to
impact the underlay routing infrastructure since the resulting
(compute-aware) path configuration, in the form of suitable
forwarding entries, needs distribution in the underlay
infrastructure.
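
To make the first mode concrete, here is a minimal sketch of "indirect" path selection: the selector only ever picks an IP address, and a path-related metric is simply one input to that choice. The Candidate fields, the additive score, and all numbers are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    instance: str            # service instance name, e.g. "S_1"
    address: str             # one IP address exposed for that instance
    load_pct: int            # hypothetical compute load, 0 to 100
    path_latency_ms: int     # hypothetical latency measured towards the address


def score(candidate):
    """Lower is better: a naive sum of a path metric and a compute metric."""
    return candidate.path_latency_ms + candidate.load_pct


def select(candidates):
    """Pick one address; the underlay alone decides how to reach it."""
    return min(candidates, key=score)


candidates = [
    Candidate("S_1", "192.0.2.1", 30, 12),     # S_1 via operator A
    Candidate("S_1", "198.51.100.1", 30, 25),  # S_1 via operator B (multi-homed)
    Candidate("S_2", "203.0.113.1", 20, 30),   # a different instance
]
best = select(candidates)
```

Here S_1 wins through its first address, so a path is effectively chosen, but only as a side effect of choosing an address. The second mode has no equivalent at this level: with a single address for S_1 there is nothing left for the selector to pick between, which is why it would have to reach into the underlay's forwarding state.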
I think we have to be clear about which of the two options we see in
the CAN scope, and also whether I have missed options here. As we can
see already from those two options, they have a significant
impact on the architecture we may envision for CAN but also for
its solution adoption. From my side, I have seen CAN mainly as an
endpoint selection problem, so understood ‘path selection’ as an
indirect one in the manner described in item 1. I just want to
throw the options out here to solicit feedback from the community
on this so that we get a good understanding moving forward.
Best,
Dirk
*From:* Dyncast [mailto:[email protected]] *On Behalf Of
*Linda Dunbar
*Sent:* 15 June 2022 23:07
*To:* [email protected]; dyncast <[email protected]>
*Cc:* rtgwg <[email protected]>; David R. Oran
<[email protected]>; jefftant.ietf <[email protected]>
*Subject:* Re: [Dyncast] CAN BoF issues #7 #17 #32
Peng,
For Issue #32, you said: “CAN does not compute path, it selects
endpoints.”
If CAN means Computing Aware Networking, it should include the
path selection. Maybe CAN is about selecting (or computing) the
optimal paths based on the combination of network conditions and
the computing resources available at the end point?
My two cents,
Linda
*From:* Dyncast <[email protected]> *On Behalf Of
*[email protected]
*Sent:* Monday, June 13, 2022 10:00 PM
*To:* dyncast <[email protected]>
*Cc:* rtgwg <[email protected]>; David R. Oran
<[email protected]>; jefftant.ietf <[email protected]>
*Subject:* [Dyncast] CAN BoF issues #7 #17 #32
Dear All,
Here are the responses to issues #7 #17 #32, any comments are
welcome! The issues and responses are also copied to the
questioner (https://datatracker.ietf.org/doc/minutes-113-can/),
hope for further suggestions and confirmation. Thanks!
#7 This seems to assume conventional non-distributed applications
just running at the edge. What about modern frameworks like
Sapphire? and Ray?
<https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/7>
It would be good to understand the multi-site requirements of such
frameworks, which seem to mainly run in single DCs.
_#17 Whether the interests of the organization deploying the
application and the organization providing the network
connectivity are aligned. Google doesn't worry about this because
they are both.
<https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/17>_
The question is more what the scope and semantics are of the information
that will need to cross organizational boundaries. This needs
further study, in particular when assuming stakeholder division
between service and network provider.
_ #32 How to effectively compute paths? Shall we put CPUs into
account?
<https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/32>_
CAN does not compute path, it selects endpoints. Path selection
(to a given endpoint) is subject to the routing at the IP
underlay. For selecting endpoints, CPU information may be taken
into account to achieve the 'compute-awareness' that CAN strives for.
You can also add your comments to any of
them (https://github.com/CAN-IETF/CAN-BoF-ietf113/issues).
Regards,
Peng
------------------------------------------------------------------------
[email protected]
*From:*Linda Dunbar <mailto:[email protected]>
*Date:* 2022-05-11 06:11
*To:*[email protected] <mailto:[email protected]>
*Subject:* [Dyncast] Categories of the CAN BoF issues
CAN BoF proponents:
Many thanks for creating the CAN BoF issues tracking in
the Github:
https://github.com/CAN-IETF/CAN-BoF-ietf113/issues/created_by/CAN-IETF?page=1&q=is%3Aopen+is%3Aissue+author%3ACAN-IETF
I went through the issues captured in the Github and
characterized them into groups. Some issues can be lumped
together for the discussion. There are quite a few issues
related to the requirements, which need to be clarified.
Best Regards, Linda
*Issues associated with Applications vs. Underlay networks:*
·Consider not to load underlay network with application
details. #35
·We have multiple upper layer application. Do we have
additional needs for routing(e.g. WG?) or we are using
those applications and won't need such new WG? #30
·It needs application information too, so it can't just
make a decision at the network layer. #23
·This does not strike me as a routing problem; it's all
service discovery that can be done in higher layers. #21
·*3GPP and URSP solve this based on UPF selection. It uses
both endpoint + application. #20*
·One overlay plane per application. Resources/metric
specific to the plane. #19
·How does the application layer or the transport layer
learn the network status to steering traffic? #16
*Need more clear requirements for CAN (*to be addressed by
draft-liu-dyncast-ps-usecases*):*
·Need to understand if there are requirements to avoid
extra messages or 1ms of latency #36
·Regarding the flow affinity, is it from network
perspective or from application/computation perspective? #33
·How to effectively compute paths? Shall we put CPUs into
account? #32
·*What happens when the user moves? If so we also need to
move application context. #25*
·It can only move the services around as fast as it can
update the routing plane. which comes back to the point
about service discovery (waiting for
convergence/distribution as opposed to just updating the
SD server) #24
·Whether the interests of the organization deploying the
application and the organization providing the network
connectivity are aligned. Google doesn't worry about this
because they are both. #17
o The question is more what the scope and semantics are of the
information that will need to cross organizational
boundaries. This needs further study, in particular when
assuming stakeholder division between service and network
provider.
·It seems impossible to satisfy that requirement
simultaneously with the latency requirement. #15
·It wasn't clear that how hard of a requirement session
persistence is. #13
o A session usually creates ephemeral state. If execution
changes from one (e.g., virtualized) service instance to
another, state/context needs transfer to another. Such
required transfer of state/context makes it desirable to
have session persistence (or instance affinity) as the
default, removing the need for explicit context transfer,
while also supporting an explicit state/context transfer
(e.g., when metrics change significantly).
·*Should it select UPF based on the application? Steering
is done per user? or per application? #9*
·This seems to assume conventional non-distributed
applications just running at the edge. what about modern
frameworks like Sapphire? and Ray? #7
o It would be good to understand the multi-site
requirements of such frameworks, which I have understood to
mainly run in single DCs.
·*Relation to 3GPP UPF #6*
·*Relation to ALTO #5*
·Are the mobility issues and associated protocols also
in scope? There are scenarios where routing alone would
not be sufficient. #4
·What is the position in the edge location regarding to
UPF? #3
·Is there some sort of authorization model so that an edge
can indicate whether or not it will provide compute
services? #2
·*What is CNC and the relationship with CAN #1*
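
The response under #13 above (affinity as the default, with an explicit switch when metrics change significantly) could be sketched roughly as follows. The AffinityTable class, the switch_margin threshold, and the load values are hypothetical; the response itself does not define what "significantly" means.

```python
class AffinityTable:
    """Keeps a flow on its current instance unless another is clearly better."""

    def __init__(self, switch_margin=0.25):
        # How much lower the best load must be before a flow is moved
        # (an assumed threshold for illustration).
        self.switch_margin = switch_margin
        self._pinned = {}  # flow id -> instance currently serving it

    def pick(self, flow_id, loads):
        """loads maps instance name -> load in the range 0.0 to 1.0."""
        best = min(loads, key=loads.get)
        current = self._pinned.get(flow_id)
        if current in loads and loads[current] - loads[best] <= self.switch_margin:
            return current  # default: stay put, no context transfer needed
        # New flow, vanished instance, or a significant metric change:
        # (re)pin the flow. A real system would transfer state/context here.
        self._pinned[flow_id] = best
        return best


table = AffinityTable(switch_margin=0.25)
first = table.pick("flow-1", {"a": 0.2, "b": 0.5})   # new flow, pinned to "a"
second = table.pick("flow-1", {"a": 0.4, "b": 0.3})  # small change, stays on "a"
third = table.pick("flow-1", {"a": 0.9, "b": 0.3})   # large change, moves to "b"
```

The sketch says nothing about where such a table lives (end host, PE, or elsewhere), which is exactly the open question in issue #33.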
*Measurement of the Computing Resources*(to be addressed
by draft-du-computing-resource-representation):
·It is hard to use existing work to measure the
computation, but we can optimize the latency through the
performance monitoring. We have performance/measurement
matrix over there. #34
·Clarifications on the computing resource, its
requirements and characteristics would be helpful. #27
·Each application may have a different definition of
"resources" these then have to be boiled down into a
single topology Network Aware Computing (NAC! :) does
scale #14
·Is computing resource measurable? #10
o It is, and how to use the measurement would be solution
related. See IFIP Networking 2022 paper on how to simply
expose “computing capability” and achieve better steering
with such simple measure.
·Why compute resource is different with other resources? #8
*Load Balance based solutions:*
·The point is that we need a standardized LB protocol #18
·The LB as part of the application itself is superior
(part of the distributed application itself is to obtain
and keep updating the "best" unicast location to use). #22
·If there is anything missing from current lbs that would
prevent their use as-is? other than there is for market
reasons no interop standard between different lbs? #12
·For the load balance, should it learn the network’s
status? #11
*Dyncast based Solution issues:*
·For Dyncast, when the time is short, is it possible for
the router to decide the routing? It is too fast. #31
·Is dyncast proposed to encapsulate? #29
·Will CAN dyncast impact each and every router? How to
avoid loops? #28
·What's the assumed scale of a D-router? 10 ^ 6 sessions?
100^ 8? What's the assumed update rate? 1Gb? 1Tb? #26
_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg