Re: Delivery Service Origin Refactor

Rawlin Peters Tue, 13 Mar 2018 10:42:52 -0700

replies inline

On Mon, Mar 12, 2018 at 5:21 PM, Nir Sopher <n...@qwilt.com> wrote:
> Thank you Rawlin for the clarification:)


You're welcome. Anything I can do to help :)

>
> Still, I feel like I'm missing a piece of the puzzle here.
> Maybe I do not understand the relationship between "origin" and "steering target".
>
> As I see it, the router's job is to send end users to the optimal cache. It
> has 2 tools for doing so: CZF and Geo.
> Using the CZF is preferable, as it is based on the real network topology.
> Geo is a best effort solution, used when we cannot do better. It is not
> necessarily optimal, and has GEO misses, but we must use it since we cannot
> map all IPs.


Yes, the client's location will be found from the CZF first, falling
back to GEO upon a CZF miss. Then the optimal edge cachegroup is
chosen for each steering target deliveryservice. Finally, the resulting
list of target deliveryservices is sorted by total distance
along the path client -> edge -> origin.
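
As a rough illustration of that ordering (a sketch only, not the actual
Traffic Router code): the haversine great-circle formula, the Point and
Target types, the target names, and the coordinates below are all
assumptions made for the example.

```python
import math
from typing import List, NamedTuple


class Point(NamedTuple):
    lat: float
    lon: float


class Target(NamedTuple):
    name: str      # hypothetical target deliveryservice name
    edge: Point    # edge cachegroup already chosen for this target
    origin: Point  # location of this target's origin


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance in kilometres between two lat/long points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlam = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def total_distance_km(client: Point, target: Target) -> float:
    """Distance along the path client -> edge -> origin."""
    return (haversine_km(client, target.edge)
            + haversine_km(target.edge, target.origin))


def order_targets(client: Point, targets: List[Target]) -> List[Target]:
    """Sort steering targets by ascending total distance."""
    return sorted(targets, key=lambda t: total_distance_km(client, t))


# Example: a client near Boulder, one edge in Denver, origins in NYC and LA.
client = Point(40.01, -105.27)
denver = Point(39.74, -104.99)
targets = [
    Target("ds-east", edge=denver, origin=Point(40.71, -74.01)),
    Target("ds-west", edge=denver, origin=Point(34.05, -118.24)),
]
ordered = order_targets(client, targets)
```

Here the edge leg is identical for both targets, so the origin leg alone
decides the order.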

>
> The cache's job is to fetch the content and serve the user.
> It can be optimized to bring the content from the optimal Origin. It can be
> configured to do so by specifying the best origin per cache group (in ops
> DB).


This is intentionally done as a CLIENT_STEERING deliveryservice so
that a smart client can make the decision to use a different
deliveryservice upon failure. If this decision were made at the caching
proxy level, it would end up being like an optimized version of MSO
(multi-site origin) where the client only has a single URL to request
and the optimal origin among multiple origins is chosen by the
caching proxy. I don't think that's a bad idea; it's just not the
architecture we want for this. By doing it as client steering we can
also assign weights/ordering between colocated origins and update
those steering assignments at any time. We can form the steering
target list very flexibly this way.
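
To make the "smart client" part concrete, here is a minimal sketch of
the failover loop such a client might run over the ordered list of URLs
it receives. The function name, the injected fetch callable, and the
choice to treat OSError as "this target failed" are assumptions for
illustration; a real client would define its own failure criteria.

```python
from typing import Callable, Iterable, Optional


def fetch_with_failover(urls: Iterable[str],
                        fetch: Callable[[str], bytes]) -> Optional[bytes]:
    """Try each steering target URL in the given order.

    Returns the body from the first target that succeeds, or None if
    every target fails.
    """
    for url in urls:
        try:
            return fetch(url)
        except OSError:
            # This target failed; fall through to the next one in the list.
            continue
    return None
```

Because the ordering is computed before the list reaches the client,
the client needs no knowledge of edge or origin locations.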


> I might be naive here, but as the number of cache groups is reasonable, and
> their network location is much clearer than the end user location, the
> mapping and configuration would be reasonable. Therefore, using sub-optimal
> Geo as a tool for choosing the Origin can be avoided.


In practice, you could set the coordinates of the Origin to those of
the optimal cachegroup, rather than assigning the Origin directly
to said cachegroup. The effect would be the same, I believe.

>
> I also did not understand if the suggestion is to use the client location
> for choosing the origin, or the cache group location for choosing the
> origin.
> Using the client location for choosing the origin practically ignores the
> accurate information provided by the CZF.


It's a combination of the client location, the edge location, and the
origin location (total distance from client -> edge -> origin).

>
> What am I missing?
> 10x
> Nir
>
> On Mon, Mar 12, 2018 at 11:19 PM, Rawlin Peters <rawlin.pet...@gmail.com>
> wrote:
>
>> Hey Nir,
>>
>> I think part of the motivation for doing this in Traffic Router rather
>> than the Caching Proxy is separation of concerns. TR is already
>> concerned with routing a client to the best cache based upon the
>> client's location, so TR is already well-equipped to make the decision
>> of how Delivery Services (origins) should be prioritized based upon
>> the client's location. That way the Caching Proxy (e.g. ATS) doesn't
>> need to concern itself with its own location, the client's location,
>> and the location of origins; it just needs to know how to get the
>> origin's content and cache it. All the client needs to know is that
>> they have a prioritized list of URLs to choose from; they don't need
>> to be concerned about origin/edge locations because that
>> prioritization will be made for them by TR.
>>
>> The target DSes will have different origins primarily because they
>> will be in different locations, and the origins should be
>> interchangeable in terms of the content they provide because a smart
>> client may fail over to any of the target DSes in a CLIENT_STEERING DS
>> for the same content.
>>
>> - Rawlin
>>
>> On Mon, Mar 12, 2018 at 2:37 PM, Nir Sopher <n...@qwilt.com> wrote:
>> > Hi Rawlin,
>> > Can you please add a few words on the motivation behind basing the
>> > steering target selection on the location of the client?
>> > As the content goes through the caches, isn't it the job of the cache to
>> > select the best origin for the cache? Why should the client be the one
>> > to take the origin location into consideration?
>> > Why do the target DSes have different origins in the first place? Do they
>> > have different characteristics in addition to their location?
>> > Thanks,
>> > Nir
>> >
>> > ---------- Forwarded message ----------
>> > From: Rawlin Peters <rawlin.pet...@gmail.com>
>> > Date: Mon, Mar 12, 2018 at 9:46 PM
>> > Subject: Delivery Service Origin Refactor
>> > To: dev@trafficcontrol.incubator.apache.org
>> >
>> >
>> > Hey folks,
>> >
>> > As promised, this email thread will be to discuss how to best
>> > associate an Origin Latitude/Longitude with a Delivery Service,
>> > primarily so that steering targets can be ordered/sent to the client
>> > based upon the location of those targets (i.e. the Origin), a.k.a.
>> > Steering Target Geo-Ordering. This is potentially going to be a pretty
>> > large change, so all your feedback/questions/concerns are appreciated.
>> >
>> > Here were a handful of bad ideas I had in order to accomplish this DS
>> > Origin Lat/Long association (feel free to skip to PROPOSED SOLUTION
>> > below):
>> >
>> > 1. Reuse the current MSO (multisite origin) backend (i.e. add the
>> > origin into the servers table, give it a lat/long from its cachegroup,
>> > assign the origin server to the DS)
>> > Pros:
>> > - reuse of existing db schema, probably wouldn't have to add any new
>> > tables/columns
>> > Cons:
>> > - MSO configuration is already very complex
>> > - for the simple case of just wanting to give an Origin a lat/long you
>> > have to create a server (of which only a few fields make sense for an
>> > Origin), add it to a cachegroup (only name and lat/long make sense,
>> > won't use parent relationships, isn't really a "group" of origins),
>> > assign it to a server profile (have to create one first, no parameters
>> > are needed), and finally assign that Origin server to the delivery
>> > service (did I miss anything?)
>> >
>> > 2. Add Origin lat/long columns to the deliveryservice table
>> > Pros:
>> > - probably the most straightforward solution for Steering Target
>> > Geo-Ordering given that Origin FQDN is currently a DS field.
>> > Cons:
>> > - doesn't work well with MSO
>> > - could be confused with Default Miss Lat/Long
>> > - if two different delivery services use colocated origins, the same
>> > lat/long needs to be entered twice
>> > - adds yet another column to the crowded deliveryservice table
>> >
>> > 3. Add origin lat/long parameters to a Delivery Service Profile
>> > Pros:
>> > - Delivery Services using colocated origins could share the same profile
>> > - no DB schema updates needed
>> > Cons:
>> > - profile parameters lack validation
>> > - still doesn't support lat/long for multiple origins associated with
>> > a DS
>> >
>> > 4. Add the lat/long to the steering target itself (i.e. where you
>> > choose weight/order, you'd also enter lat/long)
>> > Pros:
>> > - probably the easiest/quickest solution in terms of development
>> > Cons:
>> > - only applies lat/long to a steering target
>> > - using the same target in multiple Steering DSes means having to keep
>> > the lat/long synced between them all
>> > - lat/long not easily reused by other areas that may need it in the
>> > future
>> >
>> >
>> >
>> > PROPOSED SOLUTION:
>> >
>> > All of those ideas were suboptimal, which is why I think we need to:
>> > 1. Split Locations out of the cachegroup table into their own table
>> > with the following columns (cachegroup would have a foreign key to
>> > Location):
>> > - name
>> > - latitude
>> > - longitude
>> >
>> > 2. Split Origins out of the server and deliveryservice tables into
>> > their own table with the following columns:
>> > - fqdn
>> > - protocol (http or https)
>> > - port (optional, can be inferred from protocol)
>> > - location (optional FK to Location table)
>> > - deliveryservice FK (if an Origin can only be associated with a
>> > single DS. Might need step 3 below for many-to-many)
>> > - ip_address (optional, necessary to support `use_ip_address` profile
>> > parameter for using the origin's IP address rather than fqdn in
>> > parent.config)
>> > - ip6_address (optional, necessary because we'd have an ip_address
>> > column for the same reasons)
>> > - profile (optional, primarily for MSO-specific parameters - rank and
>> > weight - but I could be convinced that this is unnecessary)
>> > - cachegroup (optional, necessary to maintain primary/secondary
>> > relationship between MID_LOC and ORG_LOC cachegroups for MSO)
>> >
>> > 3. If many-to-many DSes to Origins will still be possible, create a
>> > new deliveryservice_origin table to support a many-to-many
>> > relationship between DSes and origins
>> > - the rank/weight fields for MSO could be added here possibly, maybe
>> > other things as well?
>> >
>> > 4. Consider constraints in the origin and deliveryservice_origin table
>> > - must fqdn alone be unique? fqdn, protocol, and port combined?
>> >
>> > The process for creating a Delivery Service would change in that
>> > Origins would have to be created separately and added to the delivery
>> > service. However, to aid migration to the new way of doing things, our
>> > UIs could keep the "Origin FQDN" field but the API backend would then
>> > create a new row in the Origin table and add it to the DS. More
>> > Origins could then be added (for MSO purposes) to the DS via a new API
>> > endpoint. MSO configuration would change at least in how Origins are
>> > assigned to a DS ("server assignments" would then just be for
>> > EDGE-type servers).
>> >
>> > Cachegroup creation also changes in that Locations need to be created
>> > before associating them to a Cachegroup. However, our UIs could also
>> > stay the same with the backend API updated to create a Location from
>> > the Cachegroup request and tie it to the Cachegroup.
>> >
>> >
>> >
>> > I know there are a lot of backend and frontend implications with these
>> > changes that would still need to be worked out, but in general does
>> > this proposal sound good? Questions/concerns/feedback welcome and
>> > appreciated!
>> >
>> > - Rawlin
>>
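
For readers skimming the thread, the Location and Origin tables listed
under PROPOSED SOLUTION in the quoted message could be pictured roughly
as follows. This is only a sketch: the column names come from the
proposal, but the Python types, the use of plain strings in place of
foreign keys, the default port map, and the unique_key helper (one
possible answer to the open uniqueness question) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed defaults used when the optional port column is left empty.
DEFAULT_PORTS = {"http": 80, "https": 443}


@dataclass(frozen=True)
class Location:
    name: str
    latitude: float
    longitude: float


@dataclass(frozen=True)
class Origin:
    fqdn: str
    protocol: str                        # "http" or "https"
    deliveryservice: str                 # stand-in for the DS foreign key
    port: Optional[int] = None           # optional, inferred from protocol
    location: Optional[Location] = None  # optional FK to Location
    ip_address: Optional[str] = None
    ip6_address: Optional[str] = None
    profile: Optional[str] = None        # stand-in for the profile FK
    cachegroup: Optional[str] = None     # stand-in for the cachegroup FK

    def effective_port(self) -> int:
        """Explicit port if set, otherwise the default for the protocol."""
        if self.port is not None:
            return self.port
        return DEFAULT_PORTS[self.protocol]

    def unique_key(self) -> Tuple[str, str, int]:
        """One candidate uniqueness constraint: fqdn, protocol, and port."""
        return (self.fqdn, self.protocol, self.effective_port())
```

With this candidate key, an origin that omits the port and one that
spells out the protocol default would count as the same row.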
