I've accidentally been replying to Rob directly :P

-------- Forwarded Message --------
Subject:        Re: Cache-Side Config Generation
Date:   Wed, 31 Jul 2019 09:56:37 -0600
From:   ocket8888 <[email protected]>
To:     Robert Butts <[email protected]>



> The "extra step" is just asking a local app instead of HTTP. Are you
> concerned about performance? That should be negligible. Likewise, what's the
> difference in calling a Perl/Python function, and calling an app? Is there
> really much difference in two Python files, versus a Python file calling a
> binary file?

Well... when you import a Python module, its namespace is immediately brought into the execution context. On the first run, that means compiling it first, but after that Python will check the source file's modification time and size against the ones recorded in a .pyc (or, before 3.5, a .pyo, depending on runtime flags to the interpreter) to see if it needs to again, like `make` does - and as of 3.7 it can optionally check a hash of the source instead. So the difference is calling out to `fork` and `exec*`, which involves two more context switches than just opening and reading a file.
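
To make the comparison concrete, here's a rough sketch of the two call paths. Everything in it is made up for illustration: the function names are hypothetical, and the child process is just another Python interpreter standing in for a real binary, so it says nothing about how the actual generator is invoked.

```python
import importlib.util
import subprocess
import sys


def bytecode_cache_path(source_path):
    """Return where CPython would cache compiled bytecode for a source file."""
    return importlib.util.cache_from_source(source_path)


def generate_in_process(name):
    """Path 1: an ordinary function call inside the running interpreter."""
    return "# generated config for " + name + "\n"


def generate_via_subprocess(name):
    """Path 2: start a separate program and read its stdout.

    The child here is just another Python interpreter standing in for a
    compiled binary; a real generator would have its own arguments.
    """
    child_code = "import sys; print('# generated config for ' + sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(bytecode_cache_path("ort.py"))
    print(generate_in_process("remap.config") == generate_via_subprocess("remap.config"))
```

Both paths hand back the same text; the only difference is the process boundary in the second one.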

But none of that matters because: no, I'm not concerned about performance. Depending on how the data is going to be parsed once ORT gets it back, you probably even make up for the context switches with the speed of native execution. I just like talking about Python.

The "redundancy" I was talking about was maintaining the config generation code in two places. Obviously we'd be doing that anyway if it was built into ORT, but we'd be doing it much longer if the plan was to eventually build it into ORT. That's why I was talking about this as an "extra step".

> It's part of ORT-the-RPM... Is that any better?

Yes, much.

On 7/30/19 8:46 PM, Robert Butts wrote:
> Sure, but I think that's missing the point a bit. There's still the extra
> step of fetching the configs from a local source, which is the redundancy
> that concerns me. Not in the short-term, but as a long-term solution.

I'm not sure I understand the concern. The "extra step" is just asking a
local app instead of HTTP. Are you concerned about performance? That should
be negligible. Likewise, what's the difference in calling a Perl/Python
function, and calling an app? Is there really much difference in two Python
files, versus a Python file calling a binary file?

> with the Go rewrite of Traffic Ops already more than two major versions
> and three years (I think?) old I'm dubious of adding another component
> that's supposed to "eventually" replace (and we're not even committed to
> that) another.

I also share that concern. But IMO it would be better to have a local app
generating a single config and proxying everything else, than to not have
it. IMO the ability to canary-deploy even a single config cache-side is
worth the app overhead. I'm also hopeful it will go quicker than TO --
there's far less config code than the entirety of TO, and we've already
written most of it, AFAIK there are only a few small config files left.

> this adds the potential question "what if my config generator is version
> X and ORT is version Y?"

Ahh, I think I wasn't clear about how this will be deployed. It's part of
ORT-the-RPM. The binary app isn't part of ORT-the-script, but it's in the
RPM, and installed/upgraded by Yum. See
https://github.com/apache/trafficcontrol/pull/3762/files#diff-8ebb93342b2acfa55d6c9fc7df534518
. So, it shouldn't ever be a different version than ORT-the-script, unless
someone manually copies a different binary or script file in, which Traffic
Control would not support any more than someone dropping a different
traffic_ctl in an ATS install. Is that any better?


On Tue, Jul 30, 2019 at 8:02 PM ocket8888 <[email protected]> wrote:

>> is there any reason we can't hit the DB from ORT

pls no


> ...the config generation in the ORT script itself, we would have to
write it all from scratch in Perl (the old config gen used the database
directly, it'd still have to be rewritten) or Python

but what if it _was_ in Python though? Something for me to work on this
weekend, I suppose...

> That's exactly what it does: the PR changes ORT to call this app
instead of calling Traffic Ops over HTTP:


Sure, but I think that's missing the point a bit. There's still the
extra step of fetching the configs from a local source, which is the
redundancy that concerns me. Not in the short-term, but as a long-term
solution.


> I reserve the right to develop a strong opinion about that [whether
ORT is to exist forever in concert with configuration generation] in the
future.

Of course you're entitled to that, but my concern is that we've
basically added a component here. ORT already serves the purpose of
creating on-disk configuration files from data stored in Traffic Ops,
and this adds the potential question "what if my config generator is
version X and ORT is version Y?" and I just think we have enough of that
already. Sure, ORT does more than place the configuration file, but I'm
not sure that it does "much more". It emplaces a status file, manages
packages, and sets service status. Those are arguably complex, but much
less so when you consider that only CentOS 6/7 is supported. I'd
estimate a solid 80% of ORT is dealing with configuration files, and I
understand that that's a dangerously huge rewrite (although ORT.py may
have done quite a bit of that already!), but with the Go rewrite of
Traffic Ops already more than two major versions and three years (I
think?) old I'm dubious of adding another component that's supposed to
"eventually" replace (and we're not even committed to that) another.

To be clear, though, I absolutely think that config generation on the
cache server is a massive step in the right direction, and for that
reason alone I wouldn't oppose this if it's what everyone else thinks is
best.

On 7/30/19 6:06 PM, Robert Butts wrote:
> is there any reason we can't hit the DB from ORT

Technically, it's possible. But we really, really shouldn't. The API is a
guaranteed interface. The database has no such guarantees. TC users would
then be required to deploy ORT with TO, in order; or else implement some
sort of backwards compatibility in the DB. In other words, we'd end up
having to deal with all the versioning stuff TO already does for us (and
this is why it does it).

> I'm still not convinced that it would be that hard to modify it to use
> json data instead of sql queries

I was hoping the same. I did exactly that, in the process described in the
spec (transliterate -> use objects -> use http), to be as safe as possible.
It was more code than I'd hoped. I'd estimate the changes to the logic to
use the objects, plus the code to create those objects from the API, at
20-30% of the entire config code.

> What I AM nervous about is someone rewriting all that code

I agree, there's some inevitable risk involved. FWIW I've gone to great
lengths to minimize the risk as much as possible -- see the spec,
transliterating as closely as possible, then changing as little as
possible. I also have a set of scripts (which I'm happy to share with
anyone who wants them) to pull and diff every single config file, on every
single server, edge and mid, and, for Profile endpoints, every single
profile, from our production database. I've done that for every single
config file we've rewritten, to ensure parity as much as possible.
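
For anyone curious, the core of that kind of parity check is small. This is a minimal sketch, not the actual scripts: the function name and the old/ and new/ labels are hypothetical, and the two config texts are assumed to have been fetched already (one from each generator).

```python
import difflib


def diff_configs(old_text, new_text, name="config"):
    """Return unified-diff lines between two renderings of one config file.

    An empty list means the two generators produced identical output.
    """
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="old/" + name,
            tofile="new/" + name,
            lineterm="",
        )
    )


if __name__ == "__main__":
    old = "map http://a/ http://origin-a/\nmap http://b/ http://origin-b/\n"
    new = "map http://a/ http://origin-a/\nmap http://b/ http://origin-c/\n"
    for line in diff_configs(old, new, "remap.config"):
        print(line)
```

Run per server and per file, any non-empty result flags a config that needs a closer look before the rewrite is trusted.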

Also FWIW, the very act of putting config gen in ORT means we can canary
test one cache at a time, when deploying changes to prod, to ensure correct
behavior before deploying everywhere.

I'm also hopeful this will make the config files more stable, once it's
done. The Go essentially checks every single error condition it can
conceive of. Of course, it isn't possible to check every dynamic Parameter.
But it comes pretty close. Where, I can speak from experience, the Perl
checks pretty darn close to nothing for errors. The Go language lends
itself to this, you typically have to go out of your way to ignore errors;
where Perl arguably lends itself to ignoring errors and assuming errorable
calls worked.


On Tue, Jul 30, 2019 at 5:37 PM Derek Gelinas <[email protected]>
wrote:
This is probably a stupid question, but is there any reason we can't hit
the DB from ORT, thus saving us the expense of writing any new scripting?
My understanding is that the biggest hit on traffic ops isn't the DB so
much as the perl processing for thousands of hosts at once. I assume that
the DB requests themselves would be fairly cacheable, no?

To be honest I'm still not convinced that it would be that hard to modify
it to use json data instead of sql queries. What I AM nervous about is
someone rewriting all that code. It's pretty damn particular and there
have been a few times where much more minor things have been rewritten that
missed the point of certain results entirely and as such broke things.

Derek

On Jul 30, 2019, at 6:22 PM, Robert Butts <[email protected]> wrote:

> I'm confused why this is separate from ORT.

Because ORT does a lot more than just fetching config files. Rewriting all
of ORT in Go would be considerably more work. Contrariwise, if we were to put
the config generation in the ORT script itself, we would have to write it
all from scratch in Perl (the old config gen used the database directly,
it'd still have to be rewritten) or Python. This was just the easiest path
forward.

> I feel like this logic should just be replacing the config fetching logic
> of ORT

That's exactly what it does: the PR changes ORT to call this app instead of
calling Traffic Ops over HTTP:

https://github.com/apache/trafficcontrol/pull/3762/files#diff-fe8a3eac71ee592a7170f2bdc7e65624R1485

> Is that the eventual plan? Or does our vision of the future include this
> *and* ORT?

I reserve the right to develop a strong opinion about that in the future.

On Tue, Jul 30, 2019 at 3:17 PM ocket8888 <[email protected]> wrote:

"I'm just looking for consensus that this is the right approach."
Umm... sort of. I think moving cache configuration to the cache itself is a
great idea, but I'm confused why this is separate from ORT. Like if this is
going to be generating the configs and it's already right there on the
server, I feel like this logic should just be replacing the config fetching
logic of ORT (and personally I think a neat place to try it out would be in
ORT.py).

Is that the eventual plan? Or does our vision of the future include this
*and* ORT?


On 7/30/19 2:15 PM, Robert Butts wrote:
Hi all! I've been working on moving the ATS config generation from Traffic
Ops to a standalone app alongside ORT, that queries the standard TO API to
generate its data. I just wanted to put it here, and get some feedback, to
make sure the community agrees this is the right direction.

There's a (very) brief spec here: (I might put more detail into it later,
let me know if that's important to anyone)
https://cwiki.apache.org/confluence/display/TC/Cache-Side+Config+Generation

And the Draft PR is here:
https://github.com/apache/trafficcontrol/pull/3762

This has a number of advantages:
1. TO is a monolith, this moves a significant amount of logic out of it,
into a smaller per-cache app/library that's easier to test, validate,
rewrite, deploy, canary, rollback, etc.
2. Deploying cache config changes is much smaller and safer. Instead of
having to deploy (and potentially roll back) TO, you can canary deploy on
one cache at a time.
3. This makes TC more cache-agnostic. It moves cache config generation
logic out of TO, and into an independent app/library. The app (atstccfg) is
actually very similar to Grove's config generator (grovetccfg). This makes
it easier and more obvious how to write config generators for other
proxies.
4. By using the API and putting the generator functions in a library, this
really gives a lot more flexibility to put the config gen anywhere you want
without too much work. You could easily put it in an HTTP service, or even
put it back in TO via a Plugin. That's not something that's really possible
with the existing system, generating directly from the database.

Right now, I'm just looking for consensus that this is the right approach.
Does the community agree this is the right direction? Are there concerns?
Would anyone like more details about anything in particular?

Thanks,
