Here's a SQL query to find duplicate origins on different delivery
services:

```
-- Primary origin FQDNs that appear on more than one origin row
WITH duplicate_origins AS (
  SELECT fqdn FROM origin
  WHERE is_primary
  GROUP BY fqdn
  HAVING COUNT(*) > 1
)
-- List every delivery service that uses one of those FQDNs
SELECT o.fqdn, ds.xml_id AS ds_name
FROM origin o
JOIN duplicate_origins du ON du.fqdn = o.fqdn
JOIN deliveryservice ds ON ds.id = o.deliveryservice
WHERE o.is_primary
ORDER BY o.fqdn;
```
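
If you want to try the query without touching a real Traffic Ops database, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The two-table schema is a cut-down assumption (only the columns the query touches), the sample rows are made up, and `find_duplicate_origins` and `make_toy_db` are hypothetical helper names. I also added `ds.xml_id` to the `ORDER BY` so the output order is stable.

```python
import sqlite3

# Same query as above; ds.xml_id is added to ORDER BY only to make the
# result order deterministic for this sketch.
DUPLICATE_ORIGINS_SQL = """
WITH duplicate_origins AS (
  SELECT fqdn FROM origin
  WHERE is_primary
  GROUP BY fqdn
  HAVING COUNT(*) > 1
)
SELECT o.fqdn, ds.xml_id AS ds_name
FROM origin o
JOIN duplicate_origins du ON du.fqdn = o.fqdn
JOIN deliveryservice ds ON ds.id = o.deliveryservice
WHERE o.is_primary
ORDER BY o.fqdn, ds.xml_id;
"""


def find_duplicate_origins(conn):
    """Return (fqdn, ds_name) rows for primary origins shared by several DSes."""
    return conn.execute(DUPLICATE_ORIGINS_SQL).fetchall()


def make_toy_db():
    """Build an in-memory database with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE deliveryservice (
            id INTEGER PRIMARY KEY,
            xml_id TEXT NOT NULL
        );
        CREATE TABLE origin (
            id INTEGER PRIMARY KEY,
            fqdn TEXT NOT NULL,
            is_primary INTEGER NOT NULL,
            deliveryservice INTEGER REFERENCES deliveryservice(id)
        );
        INSERT INTO deliveryservice (id, xml_id)
            VALUES (1, 'foo'), (2, 'bar'), (3, 'baz');
        -- foo and bar share a primary origin; baz only reuses it as a
        -- non-primary origin, so it should not be reported.
        INSERT INTO origin (fqdn, is_primary, deliveryservice) VALUES
            ('foo.example.net', 1, 1),
            ('foo.example.net', 1, 2),
            ('baz.example.net', 1, 3),
            ('foo.example.net', 0, 3);
    """)
    return conn


if __name__ == "__main__":
    for fqdn, ds_name in find_duplicate_origins(make_toy_db()):
        print(fqdn, ds_name)
```

With the sample rows, only `foo` and `bar` are reported, because the query looks at primary origins only.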


On Tue, Dec 18, 2018 at 1:09 PM Robert Butts <r...@apache.org> wrote:

> >can you give an example of what parent.config looks like when 2 ds's
> share an origin and have a different topology?
>
> Answering because I encountered this directly, when rewriting
> parent.config.
>
> For example: Suppose you have one Delivery Service:
> XML_ID: foo
> Type: HTTP_LIVE_NATNL
> Query String Handling: 1 - ignore in cache key, and pass up
> Origin Server Base URL: http://foo.example.net
>
> And another Delivery Service:
> XML_ID: bar
> Type: HTTP_LIVE_NATNL
> Query String Handling: 1 - ignore in cache key, and pass up
> Origin Server Base URL: http://foo.example.net
>
> ATS only supports unique `dest_domain` entries in parent.config.
> Therefore, the parent.config generated for a server assigned to both of
> these Delivery Services will either be:
>
> dest_domain=foo.example.net port=80 go_direct=true
>
> Or
>
> dest_domain=foo.example.net port=80 go_direct=false qstring=consider
>
> Right now, it's arbitrary which Perl Traffic Ops inserts, and Perl
> provides no warning or error of any kind (the pending Go parent.config PR
> logs an error).
>
> Whichever is arbitrarily inserted, the resulting remaps for the other
> delivery service will be wrong. Either `foo` requests will drop the query
> string when they shouldn't, and go to the mid when they shouldn't; or `bar`
> requests will use the query string and skip the mid when they shouldn't.
>
>
> Does that make sense? The only correct solution is to somehow prevent
> different DSes from having the same origin, and tell tenants they must use
> CNAMEs if they need to.
>
> This isn't a bug in Traffic Control. ATS fundamentally doesn't support
> multiple remap rules with the same parent FQDN with different
> configurations. Hence, Traffic Control needs to prohibit that.
>
>
> On Tue, Dec 18, 2018 at 12:24 PM Jeremy Mitchell <mitchell...@gmail.com>
> wrote:
>
>> brennan,
>>
>> can you give an example of what parent.config looks like when 2 ds's share
>> an origin and have a different topology?
>>
>> jeremy
>>
>> On Tue, Dec 18, 2018 at 11:39 AM Fieck, Brennan <
>> brennan_fi...@comcast.com>
>> wrote:
>>
>> > To be clear, the "Warning" I'm talking about would happen at startup, but
>> > I'd like a UI-only constraint to come with that to disallow using the API
>> > to bind the same origin to multiple Delivery Services with varying
>> > topography requirements. It wouldn't change the existing data, but prevent
>> > users from creating more bad data.
>> >
>> > "warning" doesn't really sufficiently describe that, my bad.
>> > ________________________________________
>> > From: Fieck, Brennan <brennan_fi...@comcast.com>
>> > Sent: Tuesday, December 18, 2018 11:24 AM
>> > To: dev@trafficcontrol.apache.org
>> > Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery Services
>> > produces indeterminate parent.config
>> >
>> > Well the cost of fixing this bug is a constraint on the data. Unless we
>> > make it a UI-only constraint - which I'm personally against - there must be
>> > some point in the future where ATC cannot reasonably be expected to work
>> > with data that violates that constraint. The question is when that should
>> > occur, which should likely happen at a minor version release. Minor not
>> > major because it doesn't involve a change in data structures, merely
>> > relationships between them - in my opinion that's a minor version change
>> > but that's definitely up for debate. With several release candidates for
>> > 3.0.0 that _doesn't_ include this restriction already in the wild, I
>> > wouldn't recommend putting it in there. That means to fix the bug as soon
>> > as possible it should go in 3.1.0 which should be the target of "master"
>> > after the 3.0.0 release is cut from it.
>> >
>> > So I'd recommend immediately implementing the constraint in master with a
>> > refusal to upgrade with bad data, and backport a warning about the future
>> > behavior into 3.0.0 or as part of a 3.0.1 provided we had more changes that
>> > would warrant a micro version bump.
>> > ________________________________________
>> > From: Gray, Jonathan <jonathan_g...@comcast.com>
>> > Sent: Tuesday, December 18, 2018 9:34 AM
>> > To: dev@trafficcontrol.apache.org
>> > Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery Services
>> > produces indeterminate parent.config
>> >
>> > -1 Holding an ATC upgrade hostage to data cleanup seems like a bad idea.
>> > The issue isn't great, but it's also not new.  We should allow teams to fix
>> > their data at their normal paces if it doesn't create significant overhead
>> > or an inherent blocker for new functionality or correction of other major
>> > problems imho.
>> >
>> > Jonathan G
>> >
>> >
>> > On 12/18/18, 9:28 AM, "Fieck, Brennan" <brennan_fi...@comcast.com> wrote:
>> >
>> >     Another option is we could detect collisions at startup and simply
>> >     refuse to continue with the upgrade until the data is fixed. That would
>> >     allow people using the now-unsupported data format to continue to use their
>> >     old versions of Traffic Ops without wrecking their database, but also
>> >     provide an incentive to clean up the data.
>> >     ________________________________________
>> >     From: Gray, Jonathan <jonathan_g...@comcast.com>
>> >     Sent: Tuesday, December 18, 2018 5:12 AM
>> >     To: dev@trafficcontrol.apache.org
>> >     Subject: Re: [EXTERNAL] Re: Origins assigned to Multipe Delivery
>> > Services produces indeterminate parent.config
>> >
>> >     I'm generally a fan of constrain your data in your database, but not
>> >     necessarily exclusively.  I see this as a one-way cleanup/conversion so it
>> >     doesn't need to be configurable; otherwise you have to ask the question
>> >     what happens if someone turns it off.  That said, something in the UI layer
>> >     would be nice to prevent spending significant quantities of time building a
>> >     complex DS only to have it fail to post for reasons that could have been
>> >     known earlier.
>> >
>> >     The way my brain works in this case:
>> >     If !unique_constraint_exists_query()
>> >             If has_duplicates_query()
>> >                     show_warning()
>> >             else
>> >                     add_unique_constraint()
>> >
>> >     to which the API and UI configuration could also make use of
>> >     unique_constraint_exists_query() to drive additional layer constraints if
>> >     desired.
>> >
>> >     Jonathan G
>> >
>> >     On 12/17/18, 1:11 PM, "Rawlin Peters" <rawlin.pet...@gmail.com> wrote:
>> >
>> >         That is an interesting idea...detect at TO startup whether or not
>> >         there are duplicate origins and operate in a "prevent duplicate
>> >         origins" state if no duplicates are found or "prevent conflicting DS
>> >         topologies" state if duplicates are found? So once operators have
>> >         replaced all the duplicate origins with CNAMEs, TO will essentially
>> >         operate in a "prohibit all duplicate origins" state. That would
>> >         probably make for a simpler transition, but I'd want to remove that
>> >         logic in a following release that strictly prohibits duplicate origins
>> >         (assuming that the community agrees we should prohibit duplicate
>> >         origins altogether).
>> >
>> >         As for DB constraints vs UI, I was thinking those DS-type constraints
>> >         I pointed out would live in the API. It would basically be added
>> >         validation in the deliveryservices POST/PUT endpoint that checks the
>> >         DB for existing DSes that conflict with the requested DS.
>> >
>> >         - Rawlin
>> >
>> >         On Mon, Dec 17, 2018 at 12:35 PM Gray, Jonathan
>> >         <jonathan_g...@comcast.com> wrote:
>> >         >
>> >         > These kinds of conditions should be detectable with a
>> >         > sufficiently advanced SQL query.  Is it possible to add the constraint if
>> >         > it passes and emit a warning during TO startup otherwise?  That would let
>> >         > you know the condition exists at startup but not getting in your way and
>> >         > keep you out of trouble once you've cleaned up.  We made a mistake early
>> >         > on, but this would acknowledge it was bad and encourage it to be fixed at
>> >         > the speed of operations teams.  Also this puts the constraint in the
>> >         > database rather than the UI which is really where the contention is for
>> >         > usability.
>> >         >
>> >         > Jonathan G
>> >         >
>> >         >
>> >         > On 12/17/18, 11:38 AM, "Rawlin Peters" <rawlin.pet...@gmail.com> wrote:
>> >         >
>> >         >     We occasionally discuss this issue but haven't tackled it yet. I think
>> >         >     the main issue is just that duplicate origins have been allowed since
>> >         >     the beginning, and now everyone's Traffic Ops could be littered with
>> >         >     duplicate origins. Also, depending on the config of the duplicate
>> >         >     delivery services, the origins might not be in conflict at all (if
>> >         >     they don't have different topology constraints). I would love for us
>> >         >     to just add a uniqueness constraint, but there would need to be a fair
>> >         >     amount of warning to the community before doing so and might
>> >         >     invalidate a significant amount of valid use cases. Operators would
>> >         >     need time to make DNS CNAME records for the duplicate origins and
>> >         >     update their DSes to use the different CNAMEs.
>> >         >
>> >         >     I think as a good first step to eliminating the use of duplicate
>> >         >     origins altogether, we should identify which "topology constraints"
>> >         >     actually cause conflicting config when used with duplicate origins and
>> >         >     prevent creating DSes with duplicate origins _if it would cause a
>> >         >     conflict with an existing DS that uses the same origin_.
>> >         >
>> >         >     For instance, I believe an HTTP and DNS-type DS can live happily
>> >         >     side-by-side using the same origin (probably need different
>> >         >     routing_names?), but scenarios like HTTP and HTTP_LIVE, or DNS and
>> >         >     HTTP_NO_CACHE sharing the same origin will cause conflicts for sure.
>> >         >     So maybe we can start by making sure the DS types "match" when using
>> >         >     the same origin:
>> >         >     HTTP + DNS: possibly good, if they have different routing names?
>> >         >     HTTP_LIVE + HTTP_LIVE_NATNL: bad
>> >         >     HTTP_NO_CACHE + [any other type]: bad
>> >         >     HTTP_LIVE + HTTP: bad
>> >         >     etc.
>> >         >
>> >         >     There are most likely other conflict scenarios that don't involve the
>> >         >     DS types, but I think this would be a good start. In the future with
>> >         >     Delivery Service Topologies (aka Flexible Cachegroups aka Bring Your
>> >         >     Own Topology), we might be able to prohibit assigning a DS to a
>> >         >     Topology if the DS's origin is already used by another DS in a
>> >         >     different Topology.
>> >         >
>> >         >     - Rawlin
>> >         >
>> >         >     On Mon, Dec 17, 2018 at 10:52 AM Fieck, Brennan
>> >         >     <brennan_fi...@comcast.com> wrote:
>> >         >     >
>> >         >     > As some of you may be aware, `parent.config` files
>> >         >     > generated by Traffic Ops can vary wildly when an origin is assigned to
>> >         >     > multiple Delivery Services. This results in undefined behavior. I'm told
>> >         >     > that the conflict only happens when two Delivery Services with different
>> >         >     > "topology requirements" use the same origin, whatever that means (content
>> >         >     > routing type?). Regardless, the issue should be addressed. The obvious
>> >         >     > solution is to put in place a database constraint that prevents an origin
>> >         >     > from being assigned to more than one Delivery Service with API checks in
>> >         >     > place that would provide helpful error messages when an attempt is made to
>> >         >     > violate the constraint. However, would that mess with things like
>> >         >     > Multi-Site Origin? Or is it just not viable for some other reason? If it is
>> >         >     > a good solution, I'm prepared to work on a fix that utilizes it.
>> >         >
>> >         >
>> >
>> >
>> >
>> >
>> >
>>
>
