Hi,

Architecture-wise, I'm in favor of Traffic Ops sending the specific configuration to the cache. The main reason is features like "DS *individual* automatic deployment", where we would like to be able to control which server gets which configuration, and when. For example, edge cache "A" can be assigned a DS only after all of its parents are aware of that DS. I believe that if control over what configuration is pulled is in the hands of the cache, the complexity of the config distribution flow would increase and it would become very difficult to debug.
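A minimal sketch of the ordering rule above, assuming a simple in-memory view of the topology. The names `can_assign`, `parents`, and `aware` are hypothetical and are not part of any Traffic Ops API; the point is only to show the kind of check a central component can make before handing a DS to an edge.

```python
from typing import Dict, List, Set


def can_assign(ds: str, server: str,
               parents: Dict[str, List[str]],
               aware: Dict[str, Set[str]]) -> bool:
    """Return True when every parent of `server` already knows about `ds`.

    `parents` maps a server name to its parent servers (hypothetical shape).
    `aware` maps a server name to the set of DS names it has been configured with.
    A server with no parents can always be assigned.
    """
    return all(ds in aware.get(p, set()) for p in parents.get(server, []))


# Made-up topology: one edge with two mid-tier parents.
parents = {"edge-A": ["mid-1", "mid-2"], "mid-1": [], "mid-2": []}
aware = {"mid-1": {"ds-video"}, "mid-2": set()}

# mid-2 does not have the DS yet, so edge-A has to wait.
print(can_assign("ds-video", "edge-A", parents, aware))
```

With a check like this living in one place, the "when" of each assignment stays visible to whoever is debugging a rollout.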
Nir

On Wed, Jul 31, 2019, 05:47 Robert Butts <r...@apache.org> wrote:

> > Sure, but I think that's missing the point a bit. There's still the extra step of fetching the configs from a local source, which is the redundancy that concerns me. Not in the short-term, but as a long-term solution.
>
> I'm not sure I understand the concern. The "extra step" is just asking a local app instead of HTTP. Are you concerned about performance? That should be negligible. Likewise, what's the difference in calling a Perl/Python function, and calling an app? Is there really much difference in two Python files, versus a Python file calling a binary file?
>
> > with the Go rewrite of Traffic Ops already more than two major versions and three years (I think?) old I'm dubious of adding another component that's supposed to "eventually" replace (and we're not even committed to that) another.
>
> I also share that concern. But IMO it would be better to have a local app generating a single config and proxying everything else, than to not have it. IMO the ability to canary-deploy even a single config cache-side is worth the app overhead. I'm also hopeful it will go quicker than TO -- there's far less config code than the entirety of TO, and we've already written most of it, AFAIK there are only a few small config files left.
>
> > this adds the potential question "what if my config generator is version X and ORT is version Y?"
>
> Ahh, I think I wasn't clear about how this will be deployed. It's part of ORT-the-RPM. The binary app isn't part of ORT-the-script, but it's in the RPM, and installed/upgraded by Yum. See
> https://github.com/apache/trafficcontrol/pull/3762/files#diff-8ebb93342b2acfa55d6c9fc7df534518 .
> So, it shouldn't ever be a different version than ORT-the-script, unless someone manually copies a different binary or script file in, which Traffic Control would not support any more than someone dropping a different traffic_ctl in an ATS install. Is that any better?
>
> On Tue, Jul 30, 2019 at 8:02 PM ocket8888 <ocket8...@gmail.com> wrote:
>
> > >> is there any reason we can't hit the DB from ORT
> >
> > pls no
> >
> > > ...the config generation in the ORT script itself, we would have to write it all from scratch in Perl (the old config gen used the database directly, it'd still have to be rewritten) or Python
> >
> > but what if it _was_ in Python though? Something for me to work on this weekend, I suppose...
> >
> > > That's exactly what it does: the PR changes ORT to call this app instead of calling Traffic Ops over HTTP:
> >
> > Sure, but I think that's missing the point a bit. There's still the extra step of fetching the configs from a local source, which is the redundancy that concerns me. Not in the short-term, but as a long-term solution.
> >
> > > I reserve the right to develop a strong opinion about that [whether ORT is to exist forever in concert with configuration generation] in the future.
> >
> > Of course you're entitled to that, but my concern is that we've basically added a component here. ORT already serves the purpose of creating on-disk configuration files from data stored in Traffic Ops, and this adds the potential question "what if my config generator is version X and ORT is version Y?" and I just think we have enough of that already. Sure, ORT does more than place the configuration file, but I'm not sure that it does "much more". It emplaces a status file, manages packages, and sets service status. Those are arguably complex, but much less so when you consider that only CentOS 6/7 is supported.
> > I'd estimate a solid 80% of ORT is dealing with configuration files, and I understand that that's a dangerously huge rewrite (although ORT.py may have done quite a bit of that already!), but with the Go rewrite of Traffic Ops already more than two major versions and three years (I think?) old I'm dubious of adding another component that's supposed to "eventually" replace (and we're not even committed to that) another.
> >
> > To be clear, though, I absolutely think that config generation on the cache server is a massive step in the right direction, and for that reason alone I wouldn't oppose this if it's what everyone else thinks is best.
> >
> > On 7/30/19 6:06 PM, Robert Butts wrote:
> > >> is there any reason we can't hit the DB from ORT
> > > Technically, it's possible. But we really, really shouldn't. The API is a guaranteed interface. The database has no such guarantees. TC users would then be required to deploy ORT with TO, in order; or else implement some sort of backwards compatibility in the DB. In other words, we'd end up having to deal with all the versioning stuff TO already does for us (and this is why it does it).
> > >
> > >> I'm still not convinced that it would be that hard to modify it to use JSON data instead of SQL queries
> > >
> > > I was hoping the same. I did exactly that, in the process described in the spec (transliterate -> use objects -> use http), to be as safe as possible. It was more code than I'd hoped. I'd estimate the changes to the logic to use the objects, plus the code to create those objects from the API, at 20-30% of the entire config code.
> > >
> > >> What I AM nervous about is someone rewriting all that code
> > > I agree, there's some inevitable risk involved.
> > > FWIW I've gone to great lengths to minimize the risk as much as possible -- see the spec, transliterating as closely as possible, then changing as little as possible. I also have a set of scripts (which I'm happy to share with anyone who wants them) to pull and diff every single config file, on every single server, edge and mid, and for Profile endpoints every single profile, from our production database. I've done that for every single config file we've rewritten, to ensure parity as much as possible.
> > >
> > > Also FWIW, the very act of putting config gen in ORT means we can canary test one cache at a time, when deploying changes to prod, to ensure correct behavior before deploying everywhere.
> > >
> > > I'm also hopeful this will make the config files more stable, once it's done. The Go essentially checks every single error condition it can conceive of. Of course, it isn't possible to check every dynamic Parameter. But it comes pretty close. Whereas, I can speak from experience, the Perl checks pretty darn close to nothing for errors. The Go language lends itself to this: you typically have to go out of your way to ignore errors, where Perl arguably lends itself to ignoring errors and assuming errorable calls worked.
> > >
> > > On Tue, Jul 30, 2019 at 5:37 PM Derek Gelinas <mrdgeli...@gmail.com> wrote:
> > >
> > >> This is probably a stupid question, but is there any reason we can't hit the DB from ORT, thus saving us the expense of writing any new scripting? My understanding is that the biggest hit on Traffic Ops isn't the DB so much as the Perl processing for thousands of hosts at once. I assume that the DB requests themselves would be fairly cacheable, no?
> > >>
> > >> To be honest I'm still not convinced that it would be that hard to modify it to use JSON data instead of SQL queries.
> > >> What I AM nervous about is someone rewriting all that code. It's pretty damn particular and there have been a few times where much more minor things have been rewritten that missed the point of certain results entirely and as such broke things.
> > >>
> > >> Derek
> > >>
> > >>> On Jul 30, 2019, at 6:22 PM, Robert Butts <r...@apache.org> wrote:
> > >>>
> > >>>> I'm confused why this is separate from ORT.
> > >>> Because ORT does a lot more than just fetching config files. Rewriting all of ORT in Go would be considerably more work. Contrariwise, if we were to put the config generation in the ORT script itself, we would have to write it all from scratch in Perl (the old config gen used the database directly, it'd still have to be rewritten) or Python. This was just the easiest path forward.
> > >>>
> > >>>> I feel like this logic should just be replacing the config fetching logic of ORT
> > >>>
> > >>> That's exactly what it does: the PR changes ORT to call this app instead of calling Traffic Ops over HTTP:
> > >>> https://github.com/apache/trafficcontrol/pull/3762/files#diff-fe8a3eac71ee592a7170f2bdc7e65624R1485
> > >>>> Is that the eventual plan? Or does our vision of the future include this *and* ORT?
> > >>>
> > >>> I reserve the right to develop a strong opinion about that in the future.
> > >>>
> > >>> On Tue, Jul 30, 2019 at 3:17 PM ocket8888 <ocket8...@gmail.com> wrote:
> > >>>
> > >>>>> "I'm just looking for consensus that this is the right approach."
> > >>>> Umm... sort of. I think moving cache configuration to the cache itself is a great idea, but I'm confused why this is separate from ORT.
> > >>>> Like if this is going to be generating the configs and it's already right there on the server, I feel like this logic should just be replacing the config fetching logic of ORT (and personally I think a neat place to try it out would be in ORT.py).
> > >>>>
> > >>>> Is that the eventual plan? Or does our vision of the future include this *and* ORT?
> > >>>>
> > >>>> On 7/30/19 2:15 PM, Robert Butts wrote:
> > >>>>> Hi all! I've been working on moving the ATS config generation from Traffic Ops to a standalone app alongside ORT, that queries the standard TO API to generate its data. I just wanted to put it here, and get some feedback, to make sure the community agrees this is the right direction.
> > >>>>>
> > >>>>> There's a (very) brief spec here: (I might put more detail into it later, let me know if that's important to anyone)
> > >>>>> https://cwiki.apache.org/confluence/display/TC/Cache-Side+Config+Generation
> > >>>>> And the Draft PR is here:
> > >>>>> https://github.com/apache/trafficcontrol/pull/3762
> > >>>>>
> > >>>>> This has a number of advantages:
> > >>>>> 1. TO is a monolith, this moves a significant amount of logic out of it, into a smaller per-cache app/library that's easier to test, validate, rewrite, deploy, canary, rollback, etc.
> > >>>>> 2. Deploying cache config changes is much smaller and safer. Instead of having to deploy (and potentially roll back) TO, you can canary deploy on one cache at a time.
> > >>>>> 3. This makes TC more cache-agnostic. It moves cache config generation logic out of TO, and into an independent app/library. The app (atstccfg) is actually very similar to Grove's config generator (grovetccfg).
> > >>>>> This makes it easier and more obvious how to write config generators for other proxies.
> > >>>>> 4. By using the API and putting the generator functions in a library, this really gives a lot more flexibility to put the config gen anywhere you want without too much work. You could easily put it in an HTTP service, or even put it back in TO via a Plugin. That's not something that's really possible with the existing system, generating directly from the database.
> > >>>>>
> > >>>>> Right now, I'm just looking for consensus that this is the right approach. Does the community agree this is the right direction? Are there concerns? Would anyone like more details about anything in particular?
> > >>>>>
> > >>>>> Thanks,
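
For readers following the thread, the parity check described in it (generate every config file with both the old and the new generator, then diff the results) could be sketched roughly as below. This is only an illustration: `diff_configs`, the dict-of-strings input shape, and the sample file contents are assumptions, not the actual scripts mentioned in the thread.

```python
import difflib
from typing import Dict, List


def diff_configs(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Return a unified diff for every config file whose content differs.

    `old` and `new` map a file name to its generated text (hypothetical shape).
    A file missing on one side is compared against empty content.
    """
    diffs: Dict[str, List[str]] = {}
    for name in sorted(set(old) | set(new)):
        a = old.get(name, "").splitlines()
        b = new.get(name, "").splitlines()
        if a != b:
            diffs[name] = list(difflib.unified_diff(
                a, b, fromfile=f"old/{name}", tofile=f"new/{name}", lineterm=""))
    return diffs


# Made-up sample output from the two generators for a single server.
old_gen = {"remap.config": "map http://a http://b\n", "parent.config": "dest_domain=.\n"}
new_gen = {"remap.config": "map http://a http://c\n", "parent.config": "dest_domain=.\n"}

for name, lines in diff_configs(old_gen, new_gen).items():
    print(name)
    print("\n".join(lines))
```

An empty result for every server and profile would be the parity signal; any non-empty diff points at the exact file and line to inspect before a canary deploy.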