Good summary Jeremy. I agree with Rawlin, I think it is reasonable to allow jobs to be changed up until they are active (using PUT) and also allow them to be DELETED at any time.
On Thu, Aug 1, 2019 at 9:28 AM Rawlin Peters <[email protected]> wrote: > I think you summed that up pretty well, Jeremy. @ocket8888 did bring > up a good point about the fact that you can submit a job without it > becoming active right away, so in theory you could be able to update a > revalidation before it actually becomes active. Maybe we should allow > PUT only when the job is "active", but you can DELETE a job at any > time. I do like the idea of the UI warning about deleting a job that > has already been activated, but the PUT of an "active" job should be > prohibited by the API _and_ UI IMO. > > - Rawlin > > On Thu, Aug 1, 2019 at 8:20 AM Jeremy Mitchell <[email protected]> > wrote: > > > > My understanding (and someone better versed in ATS please correct me if > i'm > > wrong) is that when you create a "invalidate/revalidate job" for a > delivery > > service, the following things happen: > > > > 1. the job is inserted into the job table. duh. > > 2. the reval_pending flag on ALL servers that belong to the delivery > > service's CDN is set to true (seems like overkill tbh when a delivery > > service may only be assigned to a subset of a cdn's servers but that's > > another discussion) > > 3. every minute, a cache will check if their reval_pending flag = true, > if > > so that cache will pull a new regex_revalidate.config file that will > > contain all the jobs for the cache's cdn where TTL < now > > > > now a new "rule" exists in the regex_revalidate.config to represent that > > new job: > > > > http://my.origin.com/foo.png 1567346310 <-- september 1 (one month from > now) > > > > when a request comes in to the cache for foo.png, ATS consults > > regex_revalidate.config and notices the rule and therefore, revalidates > the > > content (ignores what's in cache and goes back upstream). This is the > only > > time ATS will do this. It will only exercise this rule ONCE. foo.png is > now > > cacheable again going forward. > > > > Now imagine this delivery services is assigned to 50 caches across the > > country and this is a very active delivery service. Within 10 minutes, a > > request for foo.png has come in to each of the 50 caches and the new > > regex_revalidate rule has been exercised on each cache. So basically that > > rule is "done". it has done the job it was intended to do. > Editing/deleting > > this job will not change what's already been done. > > > > However, because of the TTL that was set on the job, the following rule > > will remain in regex_revalidate.config for a month > > > > http://my.origin.com/foo.png 1567346310 <-- september 1 (one month from > now) > > > > and ATS still needs to consult the rule to determine if it has been > > exercised or not. So there is some processing that needs to be done even > on > > a rule that is already done. I think I heard that when regex_revalidate > > gets really long, it can cause performance issues. > > > > Long story short. Does providing edit/delete of a job potentially provide > > false hope to the user? But like you, I can see value in both. Edit would > > be great if you screw it up and notice right away. Delete would be great > > for those jobs we know are done but have this huge TTL on them that is > > sucking up ATS performance unnecessarily. I know, I'm overthinking this. > > If others are good with edit/delete of jobs, I'm good. Maybe on > > edit/delete, the UI just needs some sort of warning "you realize you are > > editing/deleting a job that may have already been processed. continue?" > > > > Jeremy > > > > > > > > On Thu, Aug 1, 2019 at 7:38 AM ocket 8888 <[email protected]> wrote: > > > > > Do jobs not run constantly for their TTL? I guess I just assumed that a > > > revalidation would remain active until it's over, meaning that matching > > > content can't be cached in that duration. But I suppose that would be > > > unnecessary if content had just changed and wasn't constantly in that > > > window. > > > Still, though, that should just change what can be fixed in that > window. > > > You can't change the fact that cache servers might unnecessarily do a > lot > > > of work to revalidate content that hasn't changed, but if you forget to > > > e.g. make the TTL the same length as the Cache-Control-Max-Age header > then > > > you can still fix it. > > > > > > I'll take out the PATCH method immediately since there seems to be > > > consensus that it's not a good idea at the moment, but I'd still like > to > > > wait a bit to see if anyone else wants to chime in on PUT, since I'm > still > > > convinced editing jobs could be useful. > > > > > > > > > On Wed, Jul 31, 2019 at 10:49 AM Jeremy Mitchell < > [email protected]> > > > wrote: > > > > > > > > the most common runtime for a job is 178 hours, and the vast > majority > > > are > > > > at least 48. You effectively have the entire runtime of a job to > "fix" it > > > > if need be > > > > > > > > i believe it is common practice to set the TTL (runtime) of the > > > invalidate > > > > job to line up with the cache control max age value. that way they > can > > > > guarantee that the content is either revalidated OR expires from > cache. > > > > > > > > however, in practice, if the delivery service is very active (lots of > > > > requests), the content could be revalidated in minutes? across the > whole > > > > cdn so i don't think its true that you "effectively have the entire > > > runtime > > > > of a job to "fix" it if need be" > > > > > > > > i think that's why we've never had edit/delete because once the job > is > > > > created and deployed to the cache (used to be every 15 minutes but > now is > > > > every 1 minute), the job is out there running. not saying i don't > agree > > > > with the ability or the need to edit/delete. i'm just saying it's > tricky. > > > > > > > > > > > > > > > > On Wed, Jul 31, 2019 at 10:33 AM ocket8888 <[email protected]> > wrote: > > > > > > > > > I should also mention that in both PUT and PATCH, the only mutable > > > parts > > > > > of a job are the regular expression, the TTL and the start time. > Which > > > > > is another point I should make regarding 'you only have 60 seconds > to > > > > > edit/delete a job', because actually the start time must be in the > > > > > future, and could be set up to (but using the user/current/jobs > > > > > endpoint, no more than) two days in advance. > > > > > > > > > > On 7/31/19 10:12 AM, Chris Lemmons wrote: > > > > > > While I see the value in PATCH, Rawlin is spot on: we need > defined > > > > > > behaviour around null and missing fields in the patches. (One > > > > > > alternative: jsonpatch. It's more verbose, but clearly defines > the > > > > > > edge cases.) > > > > > > > > > > > > PATCH is also very dangerous unless you support If-Match, which > we > > > > > > don't. But that's a problem we should also fix everywhere. It's > not > > > > > > unique to this endpoint. > > > > > > > > > > > > On Tue, Jul 30, 2019 at 4:49 PM Rawlin Peters < > > > [email protected] > > > > > > > > > > wrote: > > > > > >> In my opinion, introducing PATCH methods seems like unnecessary > > > > > >> complexity. We don't really have a good way in TO-Go to support > > > > > >> partial object updates in a holistic manner today, and there are > > > some > > > > > >> difficulties around determining which fields were actually sent > by a > > > > > >> client with a null value (e.g. `"foo": null`) vs fields that > were > > > > > >> entirely omitted by the client. It would also add to the burden > of > > > > > >> testing and maintenance (when a simple PUT implementation would > > > > > >> suffice), and I don't think there's a great way for the TO Go > client > > > > > >> to marshal a lib/go-tc struct into a PATCH request that only > > > contains > > > > > >> the fields you'd like to update (sometimes with null/empty > values). > > > > > >> > > > > > >> As for PUT, I think we could get by with a POST and a DELETE > > > without a > > > > > >> PUT for this particular endpoint, but I'm not sure I really feel > > > > > >> strongly about that. Providing the ability to PUT kind of > encourages > > > > > >> the idea that you don't really have to get your invalidations > right > > > > > >> the first time, or that you can just update an existing > invalidation > > > > > >> job to change the regex instead of creating a new invalidation > with > > > a > > > > > >> different regex (when really they should be created as separate > > > jobs). > > > > > >> If you have a bad revalidation deployed, your first priority > should > > > > > >> probably be to get rid of it as quickly as possible (via DELETE) > > > > > >> instead of trying to replace it with a different regex (via > PUT). In > > > > > >> that case, I'd think it would be advantageous to only provide > the > > > > > >> DELETE option instead of both DELETE and PUT. First delete the > bad > > > > > >> invalidation ASAP, then work on a better regex. > > > > > >> > > > > > >> - Rawlin > > > > > >> > > > > > >> On Tue, Jul 30, 2019 at 10:31 AM ocket8888 <[email protected] > > > > > > wrote: > > > > > >>> I have had this PR open for a while: > > > > > >>> https://github.com/apache/trafficcontrol/pull/3744 > > > > > >>> > > > > > >>> I meant to bring this to the mailing list earlier, but I > forgot :P > > > > > >>> > > > > > >>> The reason this merits discussion is that the PR adds several > > > method > > > > > >>> handlers to the /jobs endpoint that didn't exist in Perl: > > > > > >>> > > > > > >>> - POST > > > > > >>> > > > > > >>> lets users create new jobs directly at this endpoint. My > hope > > > > is > > > > > >>> that the /user/current/jobs endpoint will fall into disuse, > and we > > > > can > > > > > >>> consolidate some functionality in one place. Obviously, this > > > > > >>> necessitates a CDN-wide queue of reval updates. > > > > > >>> > > > > > >>> - PUT > > > > > >>> > > > > > >>> allows jobs to be replaced. This queues reval updates > > > CDN-wide. > > > > > >>> > > > > > >>> - PATCH > > > > > >>> > > > > > >>> allows jobs to be edited. This also queues reval updates > > > > CDN-wide > > > > > >>> > > > > > >>> - DELETE > > > > > >>> > > > > > >>> deletes jobs. This, too, queues reval updates CDN-wide > > > > > >>> > > > > > >>> > > > > > >>> Which I think is a good idea. Without any way to mutate created > > > > jobs, a > > > > > >>> typo can have dire consequences that can't be taken back. But > since > > > > > >>> POST->DELETE->POST is really just editing with more steps, a > > > > PUT/PATCH > > > > > >>> seemed to make sense. > > > > > >>> > > > > > >>> > > > > > >>> thoughts? > > > > > >>> > > > > > > > > > > > > >
