Alright, I'm going to try to walk through this from the dev
perspective a bit too. Say we add a go_direct column (non-null,
default false) to the deliveryservice table. We can't add a new
required field to the API without breaking it, so the API backend has
to do a check like this before inserting the DS into the DB:

if go_direct is undefined:
    if ds_type in (HTTP, DNS, HTTP_LIVE_NATNL, DNS_LIVE_NATNL, etc...):
        go_direct = false
    else:
        go_direct = true

That means we would need to have some form of that code in the DS API
as well as Traffic Portal so that someone doesn't accidentally shoot
themselves in the foot and create a DS that bypasses the mid tier
without realizing it. With new DS types, we don't need
conditional checks like that or a new column in the DS table. Since
the types would still be in the DNS/HTTP families, we wouldn't have to
change CRConfig or everywhere else that discriminates between
DNS/HTTP. We'd really only have to change the code that generates
parent.config to check for the new types and set go_direct
accordingly.
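
To make the comparison concrete, here is a rough Python sketch of the two
approaches. Every name in it is made up for illustration and none of it is
actual Traffic Ops code. The set of mid-tier types is only the abbreviated
list from the snippet above (the "etc..." is not filled in), and the
*_BYPASS names are just the proposed ones from earlier in this thread.

```python
# Hypothetical sketch; these names do not come from the Traffic Ops codebase.

# Abbreviated list from the snippet above (its "etc..." is left out here).
MID_TIER_TYPES = {"HTTP", "DNS", "HTTP_LIVE_NATNL", "DNS_LIVE_NATNL"}

# Types that bypass the mid tier today, per this thread.
EXISTING_BYPASS_TYPES = {"HTTP_LIVE", "DNS_LIVE", "HTTP_NO_CACHE"}

# Proposed types (naming could be different).
PROPOSED_BYPASS_TYPES = {
    "HTTP_BYPASS",
    "DNS_BYPASS",
    "HTTP_LIVE_NATNL_BYPASS",
    "DNS_LIVE_NATNL_BYPASS",
}


def default_go_direct(ds_type, go_direct=None):
    """Column approach: fill in go_direct when an API client omits it."""
    if go_direct is not None:
        return go_direct
    # Same branching as the snippet above: mid-tier types default to false.
    return ds_type not in MID_TIER_TYPES


def go_direct_for_type(ds_type):
    """Type approach: parent.config generation derives go_direct from the type."""
    return ds_type in EXISTING_BYPASS_TYPES or ds_type in PROPOSED_BYPASS_TYPES
```

With the column approach, something like default_go_direct has to exist
everywhere a DS can be created or edited. With the type approach, only the
parent.config generation needs something like go_direct_for_type.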

Adding new type-conflicting fields like go_direct also makes it
slightly more difficult to add new DS types in the future (which we
may have to do for anycast-routed DSes) because we'd have to maintain
every instance of the code snippet above and make sure the new types
are handled properly for all the type-conflicting fields.

We'd also have to add an asterisk to the DS types in the documentation like:
HTTP_LIVE_NATNL: same as HTTP_LIVE except the mid-tier is NOT bypassed*
*unless it IS bypassed by setting go_direct=true on the Delivery Service

Which would make choosing a DS type way more confusing IMO. Wouldn't
this also make the _LIVE types obsolete, since you could just choose
_LIVE_NATNL and set go_direct=true to make it _LIVE?

I think part of the problem really is that you can't easily change the
DS type, like you said. That's actually only prevented in the UI
(probably for safety reasons), not in the API. If we had types
like HTTP and HTTP_BYPASS, the UI should allow switching between those
two types because that would be relatively safe. If you know what
you're doing and understand the consequences of changing the DS type,
I think you should be able to do that. That said, you can switch types
from HTTP_LIVE_NATNL to HTTP_LIVE today just to bypass the mid tier in
the scenario where all your mids are down. Maybe we should just
support that transition for more types?
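
As a rough illustration of what "relatively safe" switching could look
like, here is a small Python sketch. It assumes the proposed *_BYPASS
naming from earlier in this thread, and the function names are
hypothetical; nothing like this exists in the UI or API today as far as
this thread states.

```python
# Hypothetical sketch: the _BYPASS suffix is only the naming proposed in this
# thread, and these helpers are not existing Traffic Portal/Traffic Ops code.
BYPASS_SUFFIX = "_BYPASS"


def base_type(ds_type: str) -> str:
    """Strip the proposed _BYPASS suffix, if present."""
    if ds_type.endswith(BYPASS_SUFFIX):
        return ds_type[: -len(BYPASS_SUFFIX)]
    return ds_type


def is_safe_type_change(old_type: str, new_type: str) -> bool:
    """Allow a change only between a type and its own _BYPASS variant."""
    return old_type != new_type and base_type(old_type) == base_type(new_type)
```

So HTTP <-> HTTP_BYPASS would be allowed, while HTTP -> DNS_BYPASS would
still be rejected because it crosses the DNS/HTTP families.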

- Rawlin

On Tue, Jun 26, 2018 at 8:34 PM Eric Friedrich (efriedri)
<efrie...@cisco.com.invalid> wrote:
>
> Hey Rawlin-
>   Remember that the proposed feature does not always bypass the mids- we 
> already have that with the regional DS types. This is just a failure recovery 
> behavior.
>
>   Conflicts with the DS type are only a small part of the major UX problem we 
> have today relating to Origin FQDNs. Ways we can break today:
>    - Configure the same Origin server base as both live and VOD on the same 
> edge
>    - Configure the same origin server in different delivery services with 
> conflicting MSO policies
>    - Configure the same origin in different delivery services with different 
> go_direct policies (new with the proposed change, regardless of configuring 
> it as a DS type or a true/false)
>
> I don’t think it’s fair to ask this small feature to be the one to start 
> fixing these long pre-existing problems without a bigger conversation around 
> how they should be fixed (unique origins per-DS perhaps? But that comes with 
> its own set of problems!)
>
>
> I’m opposed to new DS types for the following reasons
>
> 1) Proliferation of DS types. The list is already pretty long and confusing 
> and adding this feature as DS types would add another 4 to that list, when a 
> simple True/False dropdown would suffice
>
> 2) Purpose of the DS types- I’ve always thought of the DS type as something 
> that defines a major property of the Delivery Service - Live vs VOD, HTTP vs 
> DNS. This is a minor feature which is not active 99% of the time and for the 
> most part should not influence behavior except in extreme failure 
> conditions (i.e. all parent and secondary parent mids of an edge cache 
> being down)
>
> 3) Ability to change the go_direct setting. DS types are fixed today and 
> cannot be modified for a delivery service. It would be nice if customers had 
> the ability to turn this feature on and off without needing to delete and 
> recreate the entire delivery service
>
> 4) Consistency with existing features. I think this feature lines up pretty 
> well against MSO as far as its behavior and I don’t think that we should 
> convert MSO over to a DS type.
>
> —Eric
>
> > On Jun 26, 2018, at 5:04 PM, Rawlin Peters <rawlin.pet...@gmail.com> wrote:
> >
> > What makes having more types more confusing? It fits into the current
> > "DS type determines whether or not the mids are bypassed" paradigm
> > that exists today. In my opinion, a new DS field that conflicts with
> > your chosen type in certain cases is not very intuitive and leads to
> > bad UX. Sure, having more DS types to choose from isn't exactly ideal
> > either, but when the description of type "<type>_BYPASS" is just "the
> > same as <type> except the mid tier is bypassed when all mids are down"
> > I think it would be pretty straightforward. In this case I think new
> > DS types are the lesser of two evils.
> >
> > - Rawlin
> >
> > On Tue, Jun 26, 2018 at 11:42 AM Vijay Anand
> > <vijayanand.jayaman...@gmail.com> wrote:
> >>
> > >> To me it looks like new DS types for go_direct true/false will add more
> > >> confusion than adding go_direct as a configurable DS parameter that
> > >> defaults to false except for HTTP_LIVE, DNS_LIVE, and HTTP_NO_CACHE.
> >>
> >>
> >>
> >> Thanks,
> >> Vijayanand S
> >>
> >> On Thu, Jun 21, 2018 at 4:53 AM, Rawlin Peters <rawlin.pet...@gmail.com>
> >> wrote:
> >>
> >>> I think adding new DS types for this makes sense because traditionally
> >>> the DS type determines the value of go_direct as well as how content
> >>> is cached (disk/ram/not cached at all). If we make the field directly
> >>> configurable on the Delivery Service, then we now have the complexity
> >>> of prohibiting certain go_direct values in certain types of delivery
> >>> services, and that adds to the cognitive load required to create/edit
> >>> a delivery service.
> >>>
> >>> For example, it's a bad experience for the user to find out through an
> >>> API error that they are prohibited from setting go_direct to certain
> >>> values in certain scenarios. If we just create new types for that
> >>> instead, the user doesn't have to worry about conflicting settings in
> >>> specific DS types and rather just has to choose the DS type that they
> >>> want (where the proper/best settings are chosen for them under the
> >>> hood).
> >>>
> >>> Basically we'd need 4 new types (naming could be different):
> >>> HTTP_LIVE_NATNL_BYPASS
> >>> DNS_LIVE_NATNL_BYPASS
> >>> HTTP_BYPASS
> >>> DNS_BYPASS
> >>>
> >>> The new types would mimic the matching original types except for
> >>> setting go_direct=true, which would allow the edges to fetch from the
> >>> origin when all the mids are down.
> >>>
> >>> - Rawlin
> >>>
> >>> On Wed, Jun 20, 2018 at 6:33 AM, Vijay Anand
> >>> <vijayanand.jayaman...@gmail.com> wrote:
> >>>> All,
> >>>>
> > >>>> The PR given below is a Perl implementation for making parent.config's
> > >>>> go_direct directive configurable via Delivery Service:
> > >>>> https://github.com/apache/trafficcontrol/pull/2407
> > >>>>
> > >>>> This PR has been opened to discuss various approaches to making
> > >>>> go_direct configurable. A Golang implementation will be added once we
> > >>>> finalize the approach.
> >>>>
> >>>> Background:
> >>>> ---------------
> > >>>> Right now it is hard coded as false in parent.config for DSes of types
> > >>>> other than HTTP_LIVE, HTTP_NO_CACHE, and DNS_LIVE. For such delivery
> > >>>> services, if a problem in the network or in the mids makes the mids
> > >>>> unreachable/offline, all requests fail because of this hard-coded
> > >>>> go_direct setting.
> >>>>
> >>>> By making this configurable, we are giving a choice to the operators to
> >>>> fetch directly from origin under such scenarios.
> >>>>
> >>>>
> >>>> Question:
> >>>> ------------
> > >>>> Originally it was thought of as a new column in the Deliveryservice
> > >>>> table (Go_Direct). But then Rawlin suggested adding a new delivery
> > >>>> service type to avoid some of the conflict scenarios discussed in the
> > >>>> PR. Eric and Rob feel that adding a new DS type is not desirable.
> >>>>
> > >>>> Requesting your views to reach a consensus on a suitable approach for this.
> >>>>
> >>>> Thanks,
> >>>> Vijayanand S
> >>>>
> >>>>
> >>>>
> >>>> ---------- Forwarded message ----------
> >>>> From: Vijay Anand <vijayanand.jayaman...@gmail.com>
> >>>> Date: Thu, Jun 7, 2018 at 6:45 PM
> >>>> Subject: Re: Making parent.config's go_direct directive configurable via
> >>>> Delivery service
> >>>> To: dev@trafficcontrol.apache.org
> >>>>
> >>>>
> >>>> Rawlin,
> >>>>
> > >>>> Yes, I am using the version Eric referred to (i.e. Cisco's version).
> > >>>> And it looks like in this code it is actually possible to create MSO
> > >>>> groups (origin servers) which may not contain the org_serv_fqdn. So do
> > >>>> you think making MSO enabled and Go Direct = True mutually exclusive
> > >>>> will work?
> >>>>
> >>>> Thanks,
> >>>> Vijayanand S
> >>>>
> >>>> On Mon, Jun 4, 2018 at 8:13 PM, Rawlin Peters <rawlin.pet...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Also, it's very possible this might be a nuance between the version of
> >>>>> TO you are running and the vanilla open source TO. We don't assign
> >>>>> delivery services to "groups"; we assign them to individual servers
> >>>>> (although that might mean every single server in a cachegroup). I
> >>>>> think I recall Eric mentioning "server groups" at Cisco. Might that be
> >>>>> the case?
> >>>>>
> >>>>> - Rawlin
> >>>>>
> >>>>> On Mon, Jun 4, 2018 at 8:36 AM, Rawlin Peters <rawlin.pet...@gmail.com>
> >>>>> wrote:
> >>>>>> It might be possible to create an MSO delivery service with an
> >>>>>> `orgServerFqdn` that doesn't match the [hostname + domain name] of a
> >>>>>> server it's assigned to, but I'm not sure it would work as expected.
> >>>>>> In a bug I recently fixed, I found a bit of code [1] that makes me
> >>>>>> think that scenario would lead to a bad parent.config. The issue [2]
> >>>>>> was that having a trailing slash in `orgServerFqdn` led to an empty
> > >>>>>> parent field in parent.config (because before the fix, TO didn't find
> > >>>>>> an origin Server that matched the `orgServerFqdn` of the MSO delivery
> > >>>>>> service).
> >>>>>>
> >>>>>> - Rawlin
> >>>>>>
> > >>>>>> [1] https://github.com/apache/incubator-trafficcontrol/blob/master/traffic_ops/app/lib/API/Configs/ApacheTrafficServer.pm#L2407
> >>>>>> [2] https://github.com/apache/incubator-trafficcontrol/issues/2062
> >>>>>>
> >>>>>> On Fri, Jun 1, 2018 at 6:57 AM, Vijay Anand
> >>>>>> <vijayanand.jayaman...@gmail.com> wrote:
> >>>>>>> Hi Rawlin,
> >>>>>>>
> > >>>>>>> I thought we could always create an origin server group (for MSO)
> > >>>>>>> which won't necessarily contain the origin server we configure
> > >>>>>>> while creating a DS, and that is the logic behind adding this
> > >>>>>>> conflict: "MSO = true and go_direct = true". If that is not the
> > >>>>>>> case, we don't need this conflict.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Vijayanand S
> >>>>>>>
> >>>>>>>
> >>>>>>>
> > >>>>>>> On Thu, May 31, 2018 at 9:08 PM, Rawlin Peters <rawlin.pet...@gmail.com>
> > >>>>>>> wrote:
> >>>>>>>
> > >>>>>>>> When configuring MSO currently (note this will change soon once I
> > >>>>>>>> get the MSO implementation refactored to use Origins rather than
> > >>>>>>>> Servers), you have to set `orgServerFqdn` to an FQDN that matches
> > >>>>>>>> the combined (hostname, domain_name) of at least one Server. That
> > >>>>>>>> Delivery Service is then assigned to that Server and any other
> > >>>>>>>> Server acting as an origin. So the request flow for an MSO Delivery
> > >>>>>>>> Service should still flow to both the origin Server matching the
> > >>>>>>>> DS's `orgServerFqdn` and any other origin Server that is assigned
> > >>>>>>>> that DS.
> >>>>>>>>
> >>>>>>>> - Rawlin
> >>>>>>>>
> >>>>>>>> On Thu, May 31, 2018 at 9:12 AM, Vijay Anand
> >>>>>>>> <vijayanand.jayaman...@gmail.com> wrote:
> >>>>>>>>> Hi Rawlin,
> >>>>>>>>>
> > >>>>>>>>> Adding a CNAME alias for sharing an origin to resolve the conflicts
> > >>>>>>>>> looks good and it should work. That way, if a conflict is detected,
> > >>>>>>>>> the operator has to set up a CNAME alias for that particular origin.
> > >>>>>>>>>
> > >>>>>>>>> For MSO, when someone configures MSO, they probably mean to use
> > >>>>>>>>> origin server(s) different from the Delivery Service's origin FQDN.
> > >>>>>>>>> So when go_direct is true and the mid cache is offline, the edge
> > >>>>>>>>> will go to the Delivery Service's configured origin FQDN, which is
> > >>>>>>>>> not the intended behaviour if someone configured MSO.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Vijayanand S
> >>>>>>>>>
> > >>>>>>>>> On Thu, May 31, 2018 at 3:17 AM, Rawlin Peters <rawlin.pet...@gmail.com>
> > >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Vijayanand S,
> >>>>>>>>>>
> > >>>>>>>>>> Generally we've found it's bad practice to have multiple delivery
> > >>>>>>>>>> services sharing the same origin due to the conflicts in
> > >>>>>>>>>> configuration on the caches serving those delivery services, like
> > >>>>>>>>>> you mentioned. But this can be fixed by setting up CNAME DNS records
> > >>>>>>>>>> for the shared origin and using a distinct CNAME in each delivery
> > >>>>>>>>>> service. In fact, I've discussed duplicate origins here fairly
> > >>>>>>>>>> recently due to my effort to refactor the Origin implementation,
> > >>>>>>>>>> and the tentative plan was to phase in a uniqueness constraint on
> > >>>>>>>>>> Origin FQDN so that there will be no possibility of the conflicts
> > >>>>>>>>>> that we experience today with duplicate Origin FQDNs.
> >>>>>>>>>>
> >>>>>>>>>> Would that fix your issue?
> >>>>>>>>>>
> > >>>>>>>>>> The `go_direct` option isn't hardcoded per se but is determined by
> > >>>>>>>>>> the delivery service type in order to bypass or use the mid tier.
> > >>>>>>>>>> So for HTTP_NO_CACHE, HTTP_LIVE, and DNS_LIVE, we bypass the mid
> > >>>>>>>>>> tier because that's what those types are for. Why do you want to
> > >>>>>>>>>> bypass the mid tier for MSO?
> >>>>>>>>>>
> >>>>>>>>>> - Rawlin
> >>>>>>>>>>
> >>>>>>>>>> On Wed, May 30, 2018 at 8:58 AM, Vijay Anand
> >>>>>>>>>> <vijayanand.jayaman...@gmail.com> wrote:
> >>>>>>>>>>> Hi All,
> >>>>>>>>>>>
> >>>>>>>>>>> Planning for a PR on making parent.config's go_direct directive
> >>>>>>>>>>> configurable via Delivery service. Right now, go_direct is
> >>>>>>>>>>> being hard coded.
> >>>>>>>>>>>
> > >>>>>>>>>>> Given below is a brief write-up on implementing it.
> >>>>>>>>>>>
> >>>>>>>>>>> Assumption:
> > >>>>>>>>>>> All DS-es sharing an origin server should have the same value for
> > >>>>>>>>>>> go_direct.
> >>>>>>>>>>>
> >>>>>>>>>>> Implementation:
> > >>>>>>>>>>> Add a new column 'go_direct' to the Deliveryservice table. Its
> > >>>>>>>>>>> value defaults to False.
> > >>>>>>>>>>> The Delivery Service UI (Traffic Ops) and API will be enhanced to
> > >>>>>>>>>>> support this new column.
> >>>>>>>>>>>
> >>>>>>>>>>> Conflicts:
> >>>>>>>>>>> 1. DS Type HTTP_NO_CACHE and go direct False
> >>>>>>>>>>> 2. DS Type HTTP_LIVE and go direct False
> >>>>>>>>>>> 3. DS Type DNS_LIVE and go direct False
> >>>>>>>>>>> 4. MSO = true and go_direct = true
> >>>>>>>>>>>
> >>>>>>>>>>>
> > >>>>>>>>>>> When we have more than 2 DS-es sharing the same origin, updating
> > >>>>>>>>>>> one particular DS's go_direct value will result in a conflict,
> > >>>>>>>>>>> since all DS-es sharing an origin should have the same value for
> > >>>>>>>>>>> go_direct.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Such conflicts should be resolved by deleting and recreating these
> > >>>>>>>>>>> DS-es with the new value for go_direct.
> > >>>>>>>>>>>
> > >>>>>>>>>>> This method of deleting and recreating DS-es is preferred over
> > >>>>>>>>>>> implicitly updating all the affected DS-es.
> >>>>>>>>>>>
> > >>>>>>>>>>> I would like your comments on this.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Vijayanand S
> >>>>>>>>>>
> >>>>>>>>
> >>>>>
> >>>
>
