>> 1. Prohibit creating new delivery services that would share an
>> existing origin and prohibit updating a delivery service to a shared
>> origin

In case my position has been lost, I'm still -1 on this. :) IMO, we shouldn't let the few complicated use cases increase the barrier to entry (to using ATC) for the masses.

On Fri, Jan 11, 2019 at 11:53 AM Rawlin Peters <rawlin.pet...@gmail.com> wrote:

> Alright, I'm trying to sum up this discussion so far since it seems
> like everyone went on vacation and didn't really get a chance to wrap
> this one up:
> - duplicate origins cause undefined behavior
> - we need a way to migrate to a future that is free of duplicate
> origins in Traffic Control
> - we need a visual and easy way to determine if Traffic Ops currently
> contains duplicate origins, so that operators are incentivized to fix
> them rather than let it slide indefinitely
> - operators should have a fair amount of time to fix their duplicate
> origins
>
> I believe this is what we've mostly agreed upon but haven't clearly voted
> on:
>
> In release N we will:
> 1. Prohibit creating new delivery services that would share an
> existing origin and prohibit updating a delivery service to a shared
> origin
> 2. Add some kind of visual indicator that duplicate origins are a
> problem that need to be fixed before release N+1; otherwise, an
> upgrade to N+1 will be prohibited.
>
> In release N+1 we will:
> 3. Include a DB migration that adds a uniqueness constraint on origin
> FQDN, removing the API-level checks for that.
> 4. Prevent an upgrade to N+1 if duplicate origins are found (this
> might occur as a byproduct of step 3).
>
> I am +1 on this plan and believe this would hit on all the summarized
> points above. Please provide a clear vote on this plan so that we can
> dive deeper in the details (i.e. what release 'N' is, the best visual
> indicator for step 2, and a friendly way to handle step 4).
>
> Thanks,
> Rawlin
>
> On Wed, Dec 19, 2018 at 3:39 PM Jeremy Mitchell <mitchell...@gmail.com>
> wrote:
> >
> > Not sure TP is the right place for a warning like "clean up this
> > 'duplicate' origin or your next upgrade will fail". Most users of our
> > system will be like "not my problem".
> >
> > On Wed, Dec 19, 2018 at 11:58 AM Fieck, Brennan <brennan_fi...@comcast.com>
> > wrote:
> >
> > > Probably. It would impact load times by a bit, but the page for an
> > > individual object is not our bottleneck.
> > > ________________________________________
> > > From: Robert Butts <r...@apache.org>
> > > Sent: Wednesday, December 19, 2018 11:50 AM
> > > To: dev@trafficcontrol.apache.org
> > > Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery Services
> > > produces indeterminate parent.config
> > >
> > > > - Including a warning on startup and an API constraint preventing adding
> > > > more bad data in the next 3.0.0 Release Candidate
> > > > - Adding a database constraint immediately into master that won't be
> > > > cherry-picked into 3.0.0 but should be included in 3.1.0
> > >
> > > +1
> > >
> > > I understand Jonathan's objection, but at some point, we have to be able to
> > > move forward. This is a good compromise: deprecate, then remove. That gives
> > > people a full major version to fix their data.
> > >
> > > It would be ideal if it were more than just a logged warning, though. Can we
> > > add a big red banner in Traffic Portal, on the Delivery Service page for
> > > any DS with a duplicate origin, telling users to fix it, and that they
> > > won't be able to upgrade to the next major version until it's fixed?
> > >
> > >
> > > On Wed, Dec 19, 2018 at 10:57 AM Fieck, Brennan <brennan_fi...@comcast.com>
> > > wrote:
> > >
> > > > So it seems like nobody has a problem with the "how" - disallowing
> > > > duplicate origin FQDNs on Delivery Services - but we never reached a
> > > > consensus on "when".
> > > >
> > > > I stand by my previous proposal:
> > > > - Including a warning on startup and an API constraint preventing adding
> > > > more bad data in the next 3.0.0 Release Candidate
> > > > - Adding a database constraint immediately into master that won't be
> > > > cherry-picked into 3.0.0 but should be included in 3.1.0
> > > > ________________________________________
> > > > From: Rawlin Peters <rawlin.pet...@gmail.com>
> > > > Sent: Tuesday, December 18, 2018 4:59 PM
> > > > To: dev@trafficcontrol.apache.org
> > > > Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery Services
> > > > produces indeterminate parent.config
> > > >
> > > > Also, building more around DS types will make it even harder to get
> > > > away from DS types in the future too, which I know is something we've
> > > > discussed on this mailing list before. It also adds to the overhead of
> > > > Delivery Service Topologies, since a lot of the DS types won't
> > > > carry over into that world.
> > > >
> > > > - Rawlin
> > > >
> > > > On Tue, Dec 18, 2018 at 2:42 PM Fieck, Brennan
> > > > <brennan_fi...@comcast.com> wrote:
> > > > >
> > > > > +1.
> > > > > If there's a simple way to work around duplicate origins being prohibited,
> > > > > then we should rely on that instead of "enumerating all those possible
> > > > > conflicting settings", which is not only highly complex and confusing, but
> > > > > also further entrenches us in only supporting ATS as a caching proxy
> > > > > (hurting efforts to integrate e.g. Grove, nginx etc.)
> > > > > ________________________________________
> > > > > From: Rawlin Peters <rawlin.pet...@gmail.com>
> > > > > Sent: Tuesday, December 18, 2018 2:20 PM
> > > > > To: dev@trafficcontrol.apache.org
> > > > > Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery
> > > > > Services produces indeterminate parent.config
> > > > >
> > > > > There are a number of different DS settings at play that can
> > > > > potentially cause conflicts. The question is: do we want to get into
> > > > > the business of enumerating all those possible conflicting settings or
> > > > > just simply prohibit duplicate origins altogether? I think we can dig
> > > > > in and get that "sufficiently advanced sql query" to check for
> > > > > conflicting origins, but is that something we want to carry along for
> > > > > the foreseeable future? Aren't CNAMEs relatively cheaper than
> > > > > developing and maintaining that code and the mental overhead required
> > > > > in understanding why you're getting an error that says your requested
> > > > > DS would cause an origin conflict? I think at the point you've
> > > > > requested a DS that would create a conflict, you've chosen those
> > > > > settings for a reason and would probably prefer to just create/use a
> > > > > CNAME in your new DS and keep the rest of your settings the same.
> > > > >
> > > > > Thinking in terms of errors, I'm imagining:
> > > > > "cannot create delivery service: origin fqdn 'foo.example.com' already in use"
> > > > > vs
> > > > > "cannot create delivery service: origin fqdn 'foo.example.com' already
> > > > > in use as type DNS_LIVE_NATNL, which is incompatible with your chosen
> > > > > type of HTTP_NO_CACHE"
> > > > >
> > > > > At that point you'd probably say to yourself, "uh, I need
> > > > > HTTP_NO_CACHE, so what am I supposed to do now?"
> > > > >
> > > > > As a lazy developer I'm +1 on prohibiting duplicate origin fqdns
> > > > > because the resulting code will be simpler, but I think eliminating
> > > > > the mental overhead for operators could be worthwhile too. If we can
> > > > > agree on an end state of prohibiting duplicate origins altogether, we
> > > > > can start working on a design to smoothly transition us to that point.
> > > > > Are we willing to live with "just CNAME your origin fqdn" as the
> > > > > standard solution to duplicates?
> > > > >
> > > > > - Rawlin
> > > > >
> > > > >
> > > > > On Tue, Dec 18, 2018 at 1:27 PM Gelinas, Derek
> > > > > <derek_geli...@comcast.com> wrote:
> > > > > >
> > > > > > The only situation in which they can share origins is if a) the
> > > > > > origins are shared in an MSO configuration but still have different
> > > > > > defined origin fields in the delivery service, or if they're assigned
> > > > > > to completely different cachegroups. It's when two delivery services
> > > > > > share the same edges that there's an issue, because you end up with
> > > > > > parent.config issues. Actually you could even get away with it in mids
> > > > > > as long as you weren't doing anything like MSO to it.
> > > > > >
> > > > > > Could get messy real fast, though. Best to just create a second FQDN.
> > > > > >
> > > > > > Derek
> > > > > >
> > > > > > On 12/18/18, 3:23 PM, "Fieck, Brennan" <brennan_fi...@comcast.com> wrote:
> > > > > >
> > > > > > So no two Delivery Services may share an origin *regardless of
> > > > > > cache hierarchy*? I've been told that DNS Delivery Services can have
> > > > > > the same origin as HTTP Delivery Services because they obey the same
> > > > > > cache hierarchy. You're saying that would still produce invalid output
> > > > > > and/or is explicitly disallowed by ATS?
> > > > > > ________________________________________
> > > > > > From: Robert Butts <r...@apache.org>
> > > > > > Sent: Tuesday, December 18, 2018 1:09 PM
> > > > > > To: dev@trafficcontrol.apache.org
> > > > > > Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery
> > > > > > Services produces indeterminate parent.config
> > > > > >
> > > > > > >can you give an example of what parent.config looks like when 2 ds's share
> > > > > > >an origin and have a different topology?
> > > > > >
> > > > > > Answering because I encountered this directly, when rewriting parent.config.
> > > > > >
> > > > > > For example: Suppose you have one Delivery Service:
> > > > > > XML_ID: foo
> > > > > > Type: HTTP_LIVE_NATNL
> > > > > > Query String Handling: 1 - ignore in cache key, and pass up
> > > > > > Origin Server Base URL: http://foo.example.net
> > > > > >
> > > > > > And another Delivery Service:
> > > > > > XML_ID: bar
> > > > > > Type: HTTP_LIVE_NATNL
> > > > > > Query String Handling: 1 - ignore in cache key, and pass up
> > > > > > Origin Server Base URL: http://foo.example.net
> > > > > >
> > > > > > ATS only supports unique `dest_domain` entries in parent.config. Therefore,
> > > > > > the parent.config generated for a server assigned to both of these Delivery
> > > > > > Services will either be:
> > > > > >
> > > > > > dest_domain=foo.example.net port=80 go_direct=true
> > > > > >
> > > > > > Or
> > > > > >
> > > > > > dest_domain=foo.example.net port=80 go_direct=false qstring=consider
> > > > > >
> > > > > > Right now, it's arbitrary which Perl Traffic Ops inserts, and Perl provides
> > > > > > no warning or error of any kind (the pending Go parent.config PR logs an
> > > > > > error).
> > > > > >
> > > > > > Whichever is arbitrarily inserted, the resulting remaps for the other
> > > > > > delivery service will be wrong.
> > > > > > Either `foo` requests will drop the query
> > > > > > string when they shouldn't, and go to the mid when they shouldn't; or `bar`
> > > > > > requests will use the query string and skip the mid when they shouldn't.
> > > > > >
> > > > > >
> > > > > > Does that make sense? The only correct solution is to somehow prevent
> > > > > > different DSes having the same origin, and tell tenants they must use
> > > > > > CNAMEs if they need to.
> > > > > >
> > > > > > This isn't a bug in Traffic Control. ATS fundamentally doesn't support
> > > > > > multiple remap rules with the same parent FQDN with different
> > > > > > configurations. Hence, Traffic Control needs to prohibit that.
> > > > > >
> > > > > >
> > > > > > On Tue, Dec 18, 2018 at 12:24 PM Jeremy Mitchell <mitchell...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > brennan,
> > > > > > >
> > > > > > > can you give an example of what parent.config looks like when 2 ds's share
> > > > > > > an origin and have a different topology?
> > > > > > >
> > > > > > > jeremy
> > > > > > >
> > > > > > > On Tue, Dec 18, 2018 at 11:39 AM Fieck, Brennan <brennan_fi...@comcast.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > To be clear, the "Warning" I'm talking about would happen at startup, but
> > > > > > > > I'd like a UI-only constraint to come with that to disallow using the API
> > > > > > > > to bind the same origin to multiple Delivery Services with varying
> > > > > > > > topography requirements. It wouldn't change the existing data, but prevent
> > > > > > > > users from creating more bad data.
> > > > > > > >
> > > > > > > > "warning" doesn't really sufficiently describe that, my bad.
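
[Editorial sketch, not part of the original thread.] The `dest_domain` collision Robert describes can be modeled in a few lines. This is a simplified illustration only: the `DeliveryService` fields and the `build_parent_config` helper are hypothetical and do not reflect the real Traffic Ops schema or generator. The two settings are chosen so that the generated lines match the two alternatives quoted in his message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryService:
    # Hypothetical, simplified fields; not the real Traffic Ops schema.
    xml_id: str
    origin_fqdn: str
    go_direct: bool          # True: edges skip the mid tier
    qstring_consider: bool   # True: query string is part of the cache key


def parent_config_line(ds: DeliveryService) -> str:
    """Render the single parent.config line this DS would want."""
    line = f"dest_domain={ds.origin_fqdn} port=80 go_direct={str(ds.go_direct).lower()}"
    if ds.qstring_consider:
        line += " qstring=consider"
    return line


def build_parent_config(services):
    """Return (lines keyed by dest_domain, conflicts).

    Only one line per dest_domain can be emitted, so the first DS seen wins
    and any later DS that needs a different line is reported as a conflict.
    """
    lines = {}
    conflicts = []
    for ds in services:
        wanted = parent_config_line(ds)
        emitted = lines.setdefault(ds.origin_fqdn, wanted)
        if emitted != wanted:
            conflicts.append((ds.xml_id, ds.origin_fqdn))
    return lines, conflicts


foo = DeliveryService("foo", "foo.example.net", go_direct=True, qstring_consider=False)
bar = DeliveryService("bar", "foo.example.net", go_direct=False, qstring_consider=True)

lines, conflicts = build_parent_config([foo, bar])
print(lines["foo.example.net"])
print(conflicts)
```

Swapping the input order changes which line is emitted, which is the arbitrariness described above; reporting the conflict rather than silently overwriting is one possible design, roughly in the spirit of the error logging mentioned for the pending Go rewrite.
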
> > > > > > > > ________________________________________
> > > > > > > > From: Fieck, Brennan <brennan_fi...@comcast.com>
> > > > > > > > Sent: Tuesday, December 18, 2018 11:24 AM
> > > > > > > > To: dev@trafficcontrol.apache.org
> > > > > > > > Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery Services
> > > > > > > > produces indeterminate parent.config
> > > > > > > >
> > > > > > > > Well the cost of fixing this bug is a constraint on the data. Unless we
> > > > > > > > make it a UI-only constraint - which I'm personally against - there must be
> > > > > > > > some point in the future where ATC cannot reasonably be expected to work
> > > > > > > > with data that violates that constraint. The question is when that should
> > > > > > > > occur, which should likely happen at a minor version release. Minor not
> > > > > > > > major because it doesn't involve a change in data structures, merely
> > > > > > > > relationships between them - in my opinion that's a minor version change
> > > > > > > > but that's definitely up for debate. With several release candidates for
> > > > > > > > 3.0.0 that _doesn't_ include this restriction already in the wild, I
> > > > > > > > wouldn't recommend putting it in there. That means to fix the bug as soon
> > > > > > > > as possible it should go in 3.1.0 which should be the target of "master"
> > > > > > > > after the 3.0.0 release is cut from it.
> > > > > > > >
> > > > > > > > So I'd recommend immediately implementing the constraint in master with a
> > > > > > > > refusal to upgrade with bad data, and backport a warning about the future
> > > > > > > > behavior into 3.0.0 or as part of a 3.0.1 provided we had more changes that
> > > > > > > > would warrant a micro version bump.
> > > > > > > > ________________________________________
> > > > > > > > From: Gray, Jonathan <jonathan_g...@comcast.com>
> > > > > > > > Sent: Tuesday, December 18, 2018 9:34 AM
> > > > > > > > To: dev@trafficcontrol.apache.org
> > > > > > > > Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery Services
> > > > > > > > produces indeterminate parent.config
> > > > > > > >
> > > > > > > > -1 Holding an ATC upgrade hostage to data cleanup seems like a bad idea.
> > > > > > > > The issue isn't great, but it's also not new. We should allow teams to fix
> > > > > > > > their data at their normal paces if it doesn't create significant overhead
> > > > > > > > or an inherent blocker for new functionality or correction of other major
> > > > > > > > problems imho.
> > > > > > > >
> > > > > > > > Jonathan G
> > > > > > > >
> > > > > > > >
> > > > > > > > On 12/18/18, 9:28 AM, "Fieck, Brennan" <brennan_fi...@comcast.com> wrote:
> > > > > > > >
> > > > > > > > Another option is we could detect collisions at startup and simply
> > > > > > > > refuse to continue with the upgrade until the data is fixed. That would
> > > > > > > > allow people using the now-unsupported data format to continue to use their
> > > > > > > > old versions of Traffic Ops without wrecking their database, but also
> > > > > > > > provide an incentive to clean up the data.
> > > > > > > > ________________________________________
> > > > > > > > From: Gray, Jonathan <jonathan_g...@comcast.com>
> > > > > > > > Sent: Tuesday, December 18, 2018 5:12 AM
> > > > > > > > To: dev@trafficcontrol.apache.org
> > > > > > > > Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery
> > > > > > > > Services produces indeterminate parent.config
> > > > > > > >
> > > > > > > > I'm generally a fan of constrain your data in your database, but not
> > > > > > > > necessarily exclusively. I see this as a one-way cleanup/conversion so it
> > > > > > > > doesn't need to be configurable; otherwise you have to ask the question
> > > > > > > > what happens if someone turns it off. That said, something in the UI layer
> > > > > > > > would be nice to prevent spending significant quantities of time building a
> > > > > > > > complex DS only to have it fail to post for reasons that could have been
> > > > > > > > known earlier.
> > > > > > > >
> > > > > > > > The way my brain works in this case:
> > > > > > > > if !unique_constraint_exists_query()
> > > > > > > >     if has_duplicates_query()
> > > > > > > >         show_warning()
> > > > > > > >     else
> > > > > > > >         add_unique_constraint()
> > > > > > > >
> > > > > > > > to which the API and UI configuration could also make use of
> > > > > > > > unique_constraint_exists_query() to drive additional layer constraints if
> > > > > > > > desired.
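
[Editorial sketch, not part of the original thread.] The startup flow Jonathan outlines above (warn while duplicates exist, add the constraint once the data is clean) might look roughly like this. It uses Python's built-in `sqlite3` purely so the example is self-contained; the `deliveryservice` table, its `origin_fqdn` column, and the index name are assumptions for illustration, not the actual Traffic Ops schema or migration.

```python
import sqlite3

# Hypothetical index name, for illustration only.
INDEX_NAME = "deliveryservice_origin_fqdn_unique"


def unique_constraint_exists(conn):
    """True once the unique index on origin_fqdn has been created."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
        (INDEX_NAME,),
    ).fetchone()
    return row is not None


def find_duplicates(conn):
    """Return origin FQDNs used by more than one delivery service."""
    rows = conn.execute(
        "SELECT origin_fqdn FROM deliveryservice "
        "GROUP BY origin_fqdn HAVING COUNT(*) > 1 ORDER BY origin_fqdn"
    ).fetchall()
    return [fqdn for (fqdn,) in rows]


def on_startup(conn):
    """Warn while duplicates exist; add the constraint once the data is clean."""
    if unique_constraint_exists(conn):
        return "constraint already present"
    duplicates = find_duplicates(conn)
    if duplicates:
        return "WARNING: duplicate origins: " + ", ".join(duplicates)
    conn.execute(f"CREATE UNIQUE INDEX {INDEX_NAME} ON deliveryservice (origin_fqdn)")
    return "constraint added"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deliveryservice (xml_id TEXT PRIMARY KEY, origin_fqdn TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO deliveryservice VALUES (?, ?)",
    [("foo", "foo.example.net"), ("bar", "foo.example.net")],
)
print(on_startup(conn))  # duplicates still present, so only a warning is produced

# The operator points `bar` at a CNAME of the same origin, then restarts.
conn.execute("UPDATE deliveryservice SET origin_fqdn = 'bar-origin.example.net' WHERE xml_id = 'bar'")
print(on_startup(conn))  # data is clean, so the unique index is created
```

After the index exists, an insert that reuses an origin FQDN fails at the database layer, which is the end state the thread is converging on; the API and UI could call the same `unique_constraint_exists` check to decide which validation mode to apply.
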
> > > > > > > >
> > > > > > > > Jonathan G
> > > > > > > >
> > > > > > > > On 12/17/18, 1:11 PM, "Rawlin Peters" <rawlin.pet...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > That is an interesting idea...detect at TO startup whether or not
> > > > > > > > there are duplicate origins and operate in a "prevent duplicate
> > > > > > > > origins" state if no duplicates are found or "prevent conflicting DS
> > > > > > > > topologies" state if duplicates are found? So once operators have
> > > > > > > > replaced all the duplicate origins with CNAMEs, TO will essentially
> > > > > > > > operate in a "prohibit all duplicate origins" state. That would
> > > > > > > > probably make for a simpler transition, but I'd want to remove that
> > > > > > > > logic in a following release that strictly prohibits duplicate origins
> > > > > > > > (assuming that the community agrees we should prohibit duplicate
> > > > > > > > origins altogether).
> > > > > > > >
> > > > > > > > As for DB constraints vs UI, I was thinking those DS-type constraints
> > > > > > > > I pointed out would live in the API. It would basically be added
> > > > > > > > validation in the deliveryservices POST/PUT endpoint that checks the
> > > > > > > > DB for existing DSes that conflict with the requested DS.
> > > > > > > >
> > > > > > > > - Rawlin
> > > > > > > >
> > > > > > > > On Mon, Dec 17, 2018 at 12:35 PM Gray, Jonathan
> > > > > > > > <jonathan_g...@comcast.com> wrote:
> > > > > > > > >
> > > > > > > > > These kinds of conditions should be detectable with a
> > > > > > > > > sufficiently advanced SQL query. Is it possible to add the constraint if
> > > > > > > > > it passes and emit a warning during TO startup otherwise?
> > > > > > > > > That would let
> > > > > > > > > you know the condition exists at startup but not getting in your way and
> > > > > > > > > keep you out of trouble once you've cleaned up. We made a mistake early
> > > > > > > > > on, but this would acknowledge it was bad and encourage it to be fixed at
> > > > > > > > > the speed of operations teams. Also this puts the constraint in the
> > > > > > > > > database rather than the UI which is really where the contention is for
> > > > > > > > > usability.
> > > > > > > > >
> > > > > > > > > Jonathan G
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On 12/17/18, 11:38 AM, "Rawlin Peters" <rawlin.pet...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > We occasionally discuss this issue but haven't tackled it yet. I think
> > > > > > > > > the main issue is just that duplicate origins have been allowed since
> > > > > > > > > the beginning, and now everyone's Traffic Ops could be littered with
> > > > > > > > > duplicate origins. Also, depending on the config of the duplicate
> > > > > > > > > delivery services, the origins might not be in conflict at all (if
> > > > > > > > > they don't have different topology constraints). I would love for us
> > > > > > > > > to just add a uniqueness constraint, but there would need to be a fair
> > > > > > > > > amount of warning to the community before doing so and might
> > > > > > > > > invalidate a significant amount of valid use cases. Operators would
> > > > > > > > > need time to make DNS CNAME records for the duplicate origins and
> > > > > > > > > update their DSes to use the different CNAMEs.
> > > > > > > > >
> > > > > > > > > I think as a good first step to eliminating the use of duplicate
> > > > > > > > > origins altogether, we should identify which "topology constraints"
> > > > > > > > > actually cause conflicting config when used with duplicate origins and
> > > > > > > > > prevent creating DSes with duplicate origins _if it would cause a
> > > > > > > > > conflict with an existing DS that uses the same origin_.
> > > > > > > > >
> > > > > > > > > For instance, I believe an HTTP and DNS-type DS can live happily
> > > > > > > > > side-by-side using the same origin (probably need different
> > > > > > > > > routing_names?), but scenarios like HTTP and HTTP_LIVE, or DNS and
> > > > > > > > > HTTP_NO_CACHE sharing the same origin will cause conflicts for sure.
> > > > > > > > > So maybe we can start by making sure the DS types "match" when using
> > > > > > > > > the same origin:
> > > > > > > > > HTTP + DNS: possibly good, if they have different routing names?
> > > > > > > > > HTTP_LIVE + HTTP_LIVE_NATNL: bad
> > > > > > > > > HTTP_NO_CACHE + [any other type]: bad
> > > > > > > > > HTTP_LIVE + HTTP: bad
> > > > > > > > > etc.
> > > > > > > > >
> > > > > > > > > There are most likely other conflict scenarios that don't involve the
> > > > > > > > > DS types, but I think this would be a good start.
> > > > > > > > > In the future with
> > > > > > > > > Delivery Service Topologies (aka Flexible Cachegroups aka Bring Your
> > > > > > > > > Own Topology), we might be able to prohibit assigning a DS to a
> > > > > > > > > Topology if the DS's origin is already used by another DS in a
> > > > > > > > > different Topology.
> > > > > > > > >
> > > > > > > > > - Rawlin
> > > > > > > > >
> > > > > > > > > On Mon, Dec 17, 2018 at 10:52 AM Fieck, Brennan
> > > > > > > > > <brennan_fi...@comcast.com> wrote:
> > > > > > > > > >
> > > > > > > > > > As some of you may be aware, `parent.config` files generated by
> > > > > > > > > > Traffic Ops can vary wildly when an origin is assigned to multiple
> > > > > > > > > > Delivery Services. This results in undefined behavior. I'm told that
> > > > > > > > > > the conflict only happens when two Delivery Services with different
> > > > > > > > > > "topology requirements" use the same origin, whatever that means
> > > > > > > > > > (content routing type?). Regardless, the issue should be addressed.
> > > > > > > > > > The obvious solution is to put in place a database constraint that
> > > > > > > > > > prevents an origin from being assigned to more than one Delivery
> > > > > > > > > > Service with API checks in place that would provide helpful error
> > > > > > > > > > messages when an attempt is made to violate the constraint. However,
> > > > > > > > > > would that mess with things like Multi-Site Origin? Or is it just not
> > > > > > > > > > viable for some other reason? If it is a good solution, I'm prepared
> > > > > > > > > > to work on a fix that utilizes it.
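
[Editorial sketch, not part of the original thread.] The "API checks ... that would provide helpful error messages" mentioned in the original post could be as small as the validation below. The function name, its parameters, and the in-memory `existing` mapping are hypothetical stand-ins for whatever the deliveryservices POST/PUT handler would actually query. The error text follows the simpler of the two messages Rawlin imagined earlier in the thread. Lowercasing and stripping a trailing dot before comparing is an added assumption here, on the grounds that DNS names are case-insensitive.

```python
from typing import Dict, Optional


def validate_origin(existing: Dict[str, str], xml_id: str, origin_fqdn: str) -> Optional[str]:
    """Return an error message if origin_fqdn belongs to another DS, else None.

    `existing` maps a normalized origin FQDN to the xml_id of the delivery
    service that already uses it (a stand-in for a database lookup).
    """
    key = origin_fqdn.rstrip(".").lower()
    owner = existing.get(key)
    if owner is not None and owner != xml_id:
        return f"cannot create delivery service: origin fqdn '{origin_fqdn}' already in use"
    return None


in_use = {"foo.example.com": "foo"}

# A new DS reusing the origin is rejected with a readable message.
print(validate_origin(in_use, "bar", "foo.example.com"))

# Updating the DS that already owns the origin, or using a CNAME, is allowed.
print(validate_origin(in_use, "foo", "foo.example.com"))
print(validate_origin(in_use, "bar", "bar-origin.example.com"))
```

A check like this gives operators an early, readable error, while the database constraint remains the backstop for anything that bypasses the API.
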