Re: Cache-Side Config Generation

Nir Sopher Wed, 31 Jul 2019 06:49:44 -0700

Hi,

Architecture-wise, I'm in favor of Traffic Ops sending the specific
configuration to the cache.
The main reason is features like "DS *individual* automatic deployment",
where we would like to be able to control "which server gets which
configuration and when" - e.g. edge cache "A" can be assigned a DS only
after all its parents are aware of the DS.
I believe that if control of "what configuration is pulled" is in the
hands of the cache, the complexity of the cfg distribution flow would
increase and debugging would become very difficult.
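
To make the ordering constraint concrete, here is a minimal sketch in
Python. All names in it (can_assign, parents, known_ds, "mid-1", "ds-video")
are hypothetical and only illustrate the rule "assign a DS to an edge only
after all of its parents know about it"; this is not Traffic Ops code.

```python
from typing import Dict, Set


def can_assign(ds: str, cache: str,
               parents: Dict[str, Set[str]],
               known_ds: Dict[str, Set[str]]) -> bool:
    """Return True only if every parent of `cache` already knows about `ds`.

    A cache with no parents has nothing to wait for, so it is assignable.
    """
    return all(ds in known_ds.get(p, set()) for p in parents.get(cache, set()))


# Hypothetical topology: edge "A" has two mid-tier parents.
parents = {"A": {"mid-1", "mid-2"}}
# Which delivery services each parent currently has in its configuration.
known_ds = {"mid-1": {"ds-video"}, "mid-2": set()}

print(can_assign("ds-video", "A", parents, known_ds))  # mid-2 is not aware yet

known_ds["mid-2"].add("ds-video")
print(can_assign("ds-video", "A", parents, known_ds))  # all parents are aware
```

A check like this needs a global view of the topology, which is why I think
it sits more naturally on the Traffic Ops side than on each cache.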


Nir




On Wed, Jul 31, 2019, 05:47 Robert Butts <r...@apache.org> wrote:

> >Sure, but I think that's missing the point a bit. There's still the extra
> step of fetching the configs from a local source, which is the redundancy
> that concerns me. Not in the short-term, but as a long-term solution.
>
> I'm not sure I understand the concern. The "extra step" is just asking a
> local app instead of HTTP. Are you concerned about performance? That should
> be negligible. Likewise, what's the difference in calling a Perl/Python
> function, and calling an app? Is there really much difference in two Python
> files, versus a Python file calling a binary file?
>
> >with the Go rewrite of Traffic Ops already more than two major versions
> and three years (I think?) old I'm dubious of adding another component
> that's supposed to "eventually" replace (and we're not even committed to
> that) another.
>
> I also share that concern. But IMO it would be better to have a local app
> generating a single config and proxying everything else, than to not have
> it. IMO the ability to canary-deploy even a single config cache-side is
> worth the app overhead. I'm also hopeful it will go quicker than TO --
> there's far less config code than the entirety of TO, and we've already
> written most of it; AFAIK there are only a few small config files left.
>
> > this adds the potential question "what if my config generator is version
> X and ORT is version Y?"
>
> Ahh, I think I wasn't clear about how this will be deployed. It's part of
> ORT-the-RPM. The binary app isn't part of ORT-the-script, but it's in the
> RPM, and installed/upgraded by Yum. See
> https://github.com/apache/trafficcontrol/pull/3762/files#diff-8ebb93342b2acfa55d6c9fc7df534518
> So, it shouldn't ever be a different version than ORT-the-script, unless
> someone manually copies a different binary or script file in, which Traffic
> Control would not support any more than someone dropping a different
> traffic_ctl in an ATS install. Is that any better?
>
>
> On Tue, Jul 30, 2019 at 8:02 PM ocket8888 <ocket8...@gmail.com> wrote:
>
> >  >>  is there any reason we can't hit the DB from ORT
> >
> > pls no
> >
> >
> >  > ...the config generation in the ORT script itself, we would have to
> > write it all from scratch in Perl (the old config gen used the database
> > directly, it'd still have to be rewritten) or Python
> >
> > but what if it _was_ in Python though? Something for me to work on this
> > weekend, I suppose...
> >
> >  > That's exactly what it does: the PR changes ORT to call this app
> > instead of calling Traffic Ops over HTTP:
> >
> >
> > Sure, but I think that's missing the point a bit. There's still the
> > extra step of fetching the configs from a local source, which is the
> > redundancy that concerns me. Not in the short-term, but as a long-term
> > solution.
> >
> >
> >  >  I reserve the right to develop a strong opinion about that [whether
> > ORT is to exist forever in concert with configuration generation] in the
> > future.
> >
> > Of course you're entitled to that, but my concern is that we've
> > basically added a component here. ORT already serves the purpose of
> > creating on-disk configuration files from data stored in Traffic Ops,
> > and this adds the potential question "what if my config generator is
> > version X and ORT is version Y?" and I just think we have enough of that
> > already. Sure, ORT does more than place the configuration file, but I'm
> > not sure that it does "much more". It emplaces a status file, manages
> > packages, and sets service status. Those are arguably complex, but much
> > less so when you consider that only CentOS 6/7 is supported. I'd
> > estimate a solid 80% of ORT is dealing with configuration files, and I
> > understand that that's a dangerously huge rewrite (although ORT.py may
> > have done quite a bit of that already!), but with the Go rewrite of
> > Traffic Ops already more than two major versions and three years (I
> > think?) old I'm dubious of adding another component that's supposed to
> > "eventually" replace (and we're not even committed to that) another.
> >
> > To be clear, though, I absolutely think that config generation on the
> > cache server is a massive step in the right direction, and for that
> > reason alone I wouldn't oppose this if it's what everyone else thinks is
> > best.
> >
> > On 7/30/19 6:06 PM, Robert Butts wrote:
> > >> is there any reason we can't hit the DB from ORT
> > > Technically, it's possible. But we really, really shouldn't. The API
> > > is a guaranteed interface. The database has no such guarantees. TC
> > > users would then be required to deploy ORT with TO, in order; or else
> > > implement some sort of backwards compatibility in the DB. In other
> > > words, we'd end up having to deal with all the versioning stuff TO
> > > already does for us (and this is why it does it).
> > >
> > >> I'm still not convinced that it would be that hard to modify it to
> > >> use JSON data instead of SQL queries
> > >
> > > I was hoping the same. I did exactly that, in the process described in
> > > the spec (transliterate -> use objects -> use HTTP), to be as safe as
> > > possible. It was more code than I'd hoped. I'd estimate that the
> > > changes to the logic to use the objects, plus the code to create those
> > > objects from the API, come to 20-30% of the entire config code.
> > >
> > >> What I AM nervous about is someone rewriting all that code
> > > I agree, there's some inevitable risk involved. FWIW I've gone to great
> > > lengths to minimize the risk as much as possible -- see the spec,
> > > transliterating as closely as possible, then changing as little as
> > > possible. I also have a set of scripts (which I'm happy to share with
> > > anyone who wants them) to pull and diff every single config file, on
> > > every single server, edge and mid (and, for Profile endpoints, every
> > > single profile), from our production database. I've done that for
> > > every single config file we've rewritten, to ensure parity as much as
> > > possible.
> > >
> > > Also FWIW, the very act of putting config gen in ORT means we can
> > > canary test one cache at a time, when deploying changes to prod, to
> > > ensure correct behavior before deploying everywhere.
> > >
> > > I'm also hopeful this will make the config files more stable, once it's
> > > done. The Go essentially checks every single error condition it can
> > > conceive of. Of course, it isn't possible to check every dynamic
> > > Parameter. But it comes pretty close. The Perl, I can speak from
> > > experience, checks pretty darn close to nothing for errors. The Go
> > > language lends itself to this: you typically have to go out of your
> > > way to ignore errors, whereas Perl arguably lends itself to ignoring
> > > errors and assuming errorable calls worked.
> > >
> > >
> > > On Tue, Jul 30, 2019 at 5:37 PM Derek Gelinas <mrdgeli...@gmail.com> wrote:
> > >
> > >> This is probably a stupid question, but is there any reason we can't
> > >> hit the DB from ORT, thus saving us the expense of writing any new
> > >> scripting? My understanding is that the biggest hit on Traffic Ops
> > >> isn't the DB so much as the Perl processing for thousands of hosts at
> > >> once. I assume that the DB requests themselves would be fairly
> > >> cacheable, no?
> > >>
> > >> To be honest I'm still not convinced that it would be that hard to
> > >> modify it to use JSON data instead of SQL queries. What I AM nervous
> > >> about is someone rewriting all that code. It's pretty damn particular,
> > >> and there have been a few times where much more minor things were
> > >> rewritten in a way that missed the point of certain results entirely
> > >> and as such broke things.
> > >>
> > >> Derek
> > >>
> > >>> On Jul 30, 2019, at 6:22 PM, Robert Butts <r...@apache.org> wrote:
> > >>>
> > >>>> I'm confused why this is separate from ORT.
> > >>> Because ORT does a lot more than just fetching config files.
> > >>> Rewriting all of ORT in Go would be considerably more work.
> > >>> Contrariwise, if we were to put the config generation in the ORT
> > >>> script itself, we would have to write it all from scratch in Perl
> > >>> (the old config gen used the database directly, it'd still have to
> > >>> be rewritten) or Python. This was just the easiest path forward.
> > >>>
> > >>>> I feel like this logic should just be replacing the config fetching
> > >>>> logic of ORT
> > >>>
> > >>> That's exactly what it does: the PR changes ORT to call this app
> > >>> instead of calling Traffic Ops over HTTP:
> > >>>
> > >>> https://github.com/apache/trafficcontrol/pull/3762/files#diff-fe8a3eac71ee592a7170f2bdc7e65624R1485
> > >>>
> > >>>> Is that the eventual plan? Or does our vision of the future include
> > >>>> this *and* ORT?
> > >>>
> > >>> I reserve the right to develop a strong opinion about that in the
> > >>> future.
> > >>>
> > >>>
> > >>> On Tue, Jul 30, 2019 at 3:17 PM ocket8888 <ocket8...@gmail.com> wrote:
> > >>>
> > >>>>> "I'm just looking for consensus that this is the right approach."
> > >>>> Umm... sort of. I think moving cache configuration to the cache
> > >>>> itself is a great idea, but I'm confused why this is separate from
> > >>>> ORT. Like if this is going to be generating the configs and it's
> > >>>> already right there on the server, I feel like this logic should
> > >>>> just be replacing the config fetching logic of ORT (and personally I
> > >>>> think a neat place to try it out would be in ORT.py).
> > >>>>
> > >>>> Is that the eventual plan? Or does our vision of the future include
> > >>>> this *and* ORT?
> > >>>>
> > >>>>
> > >>>> On 7/30/19 2:15 PM, Robert Butts wrote:
> > >>>>> Hi all! I've been working on moving the ATS config generation from
> > >>>>> Traffic Ops to a standalone app alongside ORT, that queries the
> > >>>>> standard TO API to generate its data. I just wanted to put it here,
> > >>>>> and get some feedback, to make sure the community agrees this is
> > >>>>> the right direction.
> > >>>>>
> > >>>>> There's a (very) brief spec here: (I might put more detail into it
> > >>>>> later, let me know if that's important to anyone)
> > >>>>> https://cwiki.apache.org/confluence/display/TC/Cache-Side+Config+Generation
> > >>>>>
> > >>>>> And the Draft PR is here:
> > >>>>> https://github.com/apache/trafficcontrol/pull/3762
> > >>>>>
> > >>>>> This has a number of advantages:
> > >>>>> 1. TO is a monolith, this moves a significant amount of logic out
> > >>>>> of it, into a smaller per-cache app/library that's easier to test,
> > >>>>> validate, rewrite, deploy, canary, rollback, etc.
> > >>>>> 2. Deploying cache config changes is much smaller and safer.
> > >>>>> Instead of having to deploy (and potentially roll back) TO, you can
> > >>>>> canary deploy on one cache at a time.
> > >>>>> 3. This makes TC more cache-agnostic. It moves cache config
> > >>>>> generation logic out of TO, and into an independent app/library.
> > >>>>> The app (atstccfg) is actually very similar to Grove's config
> > >>>>> generator (grovetccfg). This makes it easier and more obvious how
> > >>>>> to write config generators for other proxies.
> > >>>>> 4. By using the API and putting the generator functions in a
> > >>>>> library, this really gives a lot more flexibility to put the config
> > >>>>> gen anywhere you want without too much work. You could easily put
> > >>>>> it in an HTTP service, or even put it back in TO via a Plugin.
> > >>>>> That's not something that's really possible with the existing
> > >>>>> system, generating directly from the database.
> > >>>>>
> > >>>>> Right now, I'm just looking for consensus that this is the right
> > >>>>> approach. Does the community agree this is the right direction? Are
> > >>>>> there concerns? Would anyone like more details about anything in
> > >>>>> particular?
> > >>>>>
> > >>>>> Thanks,
> > >>>>>
> > >>
> >
>
