>> is there any reason we can't hit the DB from ORT
pls no
> ...the config generation in the ORT script itself, we would have to
write it all from scratch in Perl (the old config gen used the database
directly, it'd still have to be rewritten) or Python
but what if it _was_ in Python though? Something for me to work on this
weekend, I suppose...
> That's exactly what it does: the PR changes ORT to call this app
instead of calling Traffic Ops over HTTP:
Sure, but I think that's missing the point a bit. There's still the
extra step of fetching the configs from a local source, which is the
redundancy that concerns me. Not in the short-term, but as a long-term
solution.
> I reserve the right to develop a strong opinion about that [whether
ORT is to exist forever in concert with configuration generation] in the
future.
Of course you're entitled to that, but my concern is that we've
basically added a component here. ORT already serves the purpose of
creating on-disk configuration files from data stored in Traffic Ops,
and this adds the potential question "what if my config generator is
version X and ORT is version Y?" and I just think we have enough of that
already. Sure, ORT does more than place the configuration file, but I'm
not sure that it does "much more". It emplaces a status file, manages
packages, and sets service status. Those are arguably complex, but much
less so when you consider that only CentOS 6/7 is supported. I'd
estimate a solid 80% of ORT is dealing with configuration files, and I
understand that that's a dangerously huge rewrite (although ORT.py may
have done quite a bit of that already!), but with the Go rewrite of
Traffic Ops already more than two major versions and three years (I
think?) old I'm dubious of adding another component that's supposed to
"eventually" replace (and we're not even committed to that) another.
To be clear, though, I absolutely think that config generation on the
cache server is a massive step in the right direction, and for that
reason alone I wouldn't oppose this if it's what everyone else thinks is
best.
On 7/30/19 6:06 PM, Robert Butts wrote:
> is there any reason we can't hit the DB from ORT
Technically, it's possible. But we really, really shouldn't. The API is a
guaranteed interface. The database has no such guarantees. TC users would
then be required to deploy ORT with TO, in order; or else implement some
sort of backwards compatibility in the DB. In other words, we'd end up
having to deal with all the versioning stuff TO already does for us (and
this is why it does it).
> I'm still not convinced that it would be that hard to modify it to use
> JSON data instead of SQL queries
I was hoping the same. I did exactly that, in the process described in the
spec (transliterate -> use objects -> use http), to be as safe as possible.
It was more code than I'd hoped. I'd estimate that the changes to the logic
to use the objects, plus the code to create those objects from the API,
come to 20-30% of the entire config code.
> What I AM nervous about is someone rewriting all that code
I agree, there's some inevitable risk involved. FWIW I've gone to great
lengths to minimize the risk as much as possible -- see the spec:
transliterating as closely as possible, then changing as little as
possible. I also have a set of scripts (which I'm happy to share with
anyone who wants them) to pull and diff every single config file, on every
single server, edge and mid, and for Profile endpoints every single profile,
from our production database. I've done that for every single config file
we've rewritten, to ensure parity as much as possible.
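The parity check described above could look roughly like the following Go sketch. It is not the actual set of scripts mentioned in this message: the function name `diffLines` and the sample config text are made up for illustration, and a real comparison would pull both versions of each file from Traffic Ops rather than from string literals.

```go
package main

import (
	"fmt"
	"strings"
)

// diffLines compares the old and new generator output position by position
// and returns one description per line that differs. (Hypothetical helper.)
func diffLines(oldCfg, newCfg string) []string {
	a := strings.Split(strings.TrimRight(oldCfg, "\n"), "\n")
	b := strings.Split(strings.TrimRight(newCfg, "\n"), "\n")
	n := len(a)
	if len(b) > n {
		n = len(b)
	}
	var diffs []string
	for i := 0; i < n; i++ {
		var l, r string
		if i < len(a) {
			l = a[i]
		}
		if i < len(b) {
			r = b[i]
		}
		if l != r {
			diffs = append(diffs, fmt.Sprintf("line %d: old=%q new=%q", i+1, l, r))
		}
	}
	return diffs
}

func main() {
	// Placeholder file contents; not real ATS configuration.
	oldCfg := "key_a 1\nkey_b 2\n"
	newCfg := "key_a 1\nkey_b 3\n"

	diffs := diffLines(oldCfg, newCfg)
	if len(diffs) == 0 {
		fmt.Println("configs match")
		return
	}
	for _, d := range diffs {
		fmt.Println(d)
	}
}
```

Running a comparison like this per file, per server, gives a simple pass/fail signal for each rewritten config before it ships.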
Also FWIW, the very act of putting config gen in ORT means we can canary
test one cache at a time, when deploying changes to prod, to ensure correct
behavior before deploying everywhere.
I'm also hopeful this will make the config files more stable, once it's
done. The Go essentially checks every single error condition it can
conceive of. Of course, it isn't possible to check every dynamic Parameter,
but it comes pretty close. Whereas, I can speak from experience, the Perl
checks pretty darn close to nothing for errors. The Go language lends
itself to this: you typically have to go out of your way to ignore errors,
where Perl arguably lends itself to ignoring errors and assuming errorable
calls worked.
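To illustrate the point about Go's error handling, here is a minimal sketch. It is not code from the PR: the helper `parseMaxConns` and the Parameter name `max_connections` are hypothetical. It only shows how each fallible step returns an error that the caller has to handle or visibly discard.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseMaxConns is a hypothetical helper: it reads an integer Parameter value
// and reports every failure instead of silently assuming a default.
func parseMaxConns(params map[string]string) (int, error) {
	const name = "max_connections" // made-up Parameter name
	raw, ok := params[name]
	if !ok {
		return 0, fmt.Errorf("parameter %q is missing", name)
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("parameter %q is not an integer: %w", name, err)
	}
	if n <= 0 {
		return 0, fmt.Errorf("parameter %q must be positive, got %d", name, n)
	}
	return n, nil
}

func main() {
	inputs := []map[string]string{
		{"max_connections": "100"},
		{"max_connections": "lots"},
		{},
	}
	for _, params := range inputs {
		n, err := parseMaxConns(params)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", n)
	}
}
```

Ignoring the error here would require writing `n, _ := parseMaxConns(params)`, which is at least visible in review.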
On Tue, Jul 30, 2019 at 5:37 PM Derek Gelinas <[email protected]>
wrote:
This is probably a stupid question, but is there any reason we can't hit
the DB from ORT, thus saving us the expense of writing any new scripting?
My understanding is that the biggest hit on Traffic Ops isn't the DB so
much as the Perl processing for thousands of hosts at once. I assume that
the DB requests themselves would be fairly cacheable, no?
To be honest I'm still not convinced that it would be that hard to modify
it to use JSON data instead of SQL queries. What I AM nervous about is
someone rewriting all that code. It's pretty damn particular and there
have been a few times where much more minor things have been rewritten that
missed the point of certain results entirely and as such broke things.
Derek
On Jul 30, 2019, at 6:22 PM, Robert Butts <[email protected]> wrote:
> I'm confused why this is separate from ORT.
Because ORT does a lot more than just fetching config files. Rewriting all
of ORT in Go would be considerably more work. Contrariwise, if we were to
put the config generation in the ORT script itself, we would have to write
it all from scratch in Perl (the old config gen used the database directly,
it'd still have to be rewritten) or Python. This was just the easiest path
forward.
> I feel like this logic should just be replacing the config fetching logic
> of ORT
That's exactly what it does: the PR changes ORT to call this app instead of
calling Traffic Ops over HTTP:
https://github.com/apache/trafficcontrol/pull/3762/files#diff-fe8a3eac71ee592a7170f2bdc7e65624R1485
> Is that the eventual plan? Or does our vision of the future include this
> *and* ORT?
I reserve the right to develop a strong opinion about that in the future.
On Tue, Jul 30, 2019 at 3:17 PM ocket8888 <[email protected]> wrote:
"I'm just looking for consensus that this is the right approach."
Umm... sort of. I think moving cache configuration to the cache itself is a
great idea, but I'm confused why this is separate from ORT. Like if this is
going to be generating the configs and it's already right there on the
server, I feel like this logic should just be replacing the config fetching
logic of ORT (and personally I think a neat place to try it out would be in
ORT.py).
Is that the eventual plan? Or does our vision of the future include this
*and* ORT?
On 7/30/19 2:15 PM, Robert Butts wrote:
Hi all! I've been working on moving the ATS config generation from Traffic
Ops to a standalone app alongside ORT, that queries the standard TO API to
generate its data. I just wanted to put it here, and get some feedback, to
make sure the community agrees this is the right direction.
There's a (very) brief spec here: (I might put more detail into it later,
let me know if that's important to anyone)
https://cwiki.apache.org/confluence/display/TC/Cache-Side+Config+Generation
And the Draft PR is here:
https://github.com/apache/trafficcontrol/pull/3762
This has a number of advantages:
1. TO is a monolith, this moves a significant amount of logic out of it,
into a smaller per-cache app/library that's easier to test, validate,
rewrite, deploy, canary, rollback, etc.
2. Deploying cache config changes is much smaller and safer. Instead of
having to deploy (and potentially roll back) TO, you can canary deploy on
one cache at a time.
3. This makes TC more cache-agnostic. It moves cache config generation
logic out of TO, and into an independent app/library. The app (atstccfg) is
actually very similar to Grove's config generator (grovetccfg). This makes
it easier and more obvious how to write config generators for other
proxies.
4. By using the API and putting the generator functions in a library, this
really gives a lot more flexibility to put the config gen anywhere you want
without too much work. You could easily put it in an HTTP service, or even
put it back in TO via a Plugin. That's not something that's really possible
with the existing system, generating directly from the database.
Right now, I'm just looking for consensus that this is the right
approach.
Does the community agree this is the right direction? Are there
concerns?
Would anyone like more details about anything in particular?
Thanks,