Re: Backup Cache Group Selection API Format

2018-03-21 Thread Vijay Anand
Here is the PR for the initial version of the code changes discussed for
https://github.com/apache/incubator-trafficcontrol/issues/1907

https://github.com/apache/incubator-trafficcontrol/pull/2029

Did minimal testing. Hosting it before testing it fully to get early review
comments.

Thanks,
Vijayanand S

On Tue, Mar 20, 2018 at 9:51 PM, Rawlin Peters <rawlin.pet...@gmail.com>
wrote:

> I think we should do a few things:
> 1. POST: receive a full list of ordered backups and create the
> fallback list. If the fallback list already exists, render a failure.
> 2. PUT: receive a full list of ordered backups, delete the existing
> fallback list, and create a new fallback list. If the fallback list
> does not exist yet, render a failure.
> 3. DELETE: delete the existing fallback list for a cachegroup. If the
> fallback list does not exist, render a failure.
> 4. GET: return the existing fallback list for a cachegroup. If the
> fallback list does not exist, render a failure.
>
> Then the client behavior would be:
> GET the current fallbacks for a cachegroup. If the list exists, make
> updates and PUT the entire updated list. If the list does not exist,
> create a list and POST the entire list. To empty the list, use the
> DELETE endpoint (or do a PUT with an empty list).
>
> Does that sound good? Right now it doesn't make sense for a client to
> do a PUT for just a single fallback entry, unless we add some sort of
> insert/append/prepend PUT APIs which would seem superfluous.
>
> - Rawlin
>
>
>
> On Mon, Mar 19, 2018 at 11:29 PM, Vijay Anand
> <vijayanand.jayaman...@gmail.com> wrote:
> > Shall i take this format  :[{"name": "grp1", "order": 1},{"name": "grp3",
> > "order: 2}]
> >
> > @Rawlin
> >>Perhaps the POST/PUT endpoints are basically the same and always take the
> > full list of backups.
> >
> > Wanted to make sure the behavior of PUT/ POST endpoints; POST will delete
> > an existing set of backups and insert the new ones. While PUT will update
> > them.
> >
> > For example, I have grp2(1) and grp3(2) as backups for grp1:
> >
> > Case #1: I get a PUT with grp2 (3), I will update grp2 with new order 3.
> > grp1 still has two backup cache groups.
> >
> > Case #2: I get a POST with gr2(3), I will delete all the existing entries
> > and insert grp2  making grp2 as the only backup for grp1 with order 3.
> >
> > Isnt that right?
> >
> > -Vijayanand S
> >
> > On Tue, Mar 20, 2018 at 8:06 AM, Chris Lemmons <alfic...@gmail.com>
> wrote:
> >
> >> You can keep each object without the order parameter:
> >>
> >> [{"name": "grp1"}, {"name":"grp3"}]
> >>
> >> Nevertheless, it sounds like I'm a minority opinion on this one. It's
> >> not that important of an issue, I think. Either way will work.
> >>
> >> On Mon, Mar 19, 2018 at 5:50 PM, Rawlin Peters <rawlin.pet...@gmail.com
> >
> >> wrote:
> >> > Yeah, maybe it doesn't make sense to update just one backup entry at a
> >> > time because you'd have to update the order of the rest of the backups
> >> > as well. Perhaps the POST/PUT endpoints are basically the same and
> >> > always take the full list of backups. But I think we should still keep
> >> > each entry as an object with an explicit "order" so that we can extend
> >> > it more easily in the future. I could easily see us adding a "weight"
> >> > to each entry, so that a particular cachegroup could split the
> >> > fallback traffic between multiple cachegroups at a time rather than
> >> > falling back through them sequentially.
> >> >
> >> > - Rawlin
> >> >
> >> > On Mon, Mar 19, 2018 at 2:30 PM, Chris Lemmons <alfic...@gmail.com>
> >> wrote:
> >> >> So, to continue the conversation, it looks like the list of backup
> >> >> groups is stored as a List. It's currently loaded by
> iterating
> >> >> the elements of the json array in order. That looks great to me.
> >> >>
> >> >> It seems odd to me to have a separate order parameter in the API.
> >> >> Since order has to be unique, it's unlikely that you'd be able to
> >> >> update the order component to do anything other than "move to an end"
> >> >> without updating about half the rows in the db anyway. It just feels
> >> >> like we're asking more of the API user.
> >> >>
> >> 

Re: Backup Cache Group Selection API Format

2018-03-19 Thread Vijay Anand
Shall i take this format  :[{"name": "grp1", "order": 1},{"name": "grp3",
"order: 2}]

@Rawlin
>Perhaps the POST/PUT endpoints are basically the same and always take the
full list of backups.

Wanted to make sure the behavior of PUT/ POST endpoints; POST will delete
an existing set of backups and insert the new ones. While PUT will update
them.

For example, I have grp2(1) and grp3(2) as backups for grp1:

Case #1: I get a PUT with grp2 (3), I will update grp2 with new order 3.
grp1 still has two backup cache groups.

Case #2: I get a POST with gr2(3), I will delete all the existing entries
and insert grp2  making grp2 as the only backup for grp1 with order 3.

Isnt that right?

-Vijayanand S

On Tue, Mar 20, 2018 at 8:06 AM, Chris Lemmons  wrote:

> You can keep each object without the order parameter:
>
> [{"name": "grp1"}, {"name":"grp3"}]
>
> Nevertheless, it sounds like I'm a minority opinion on this one. It's
> not that important of an issue, I think. Either way will work.
>
> On Mon, Mar 19, 2018 at 5:50 PM, Rawlin Peters 
> wrote:
> > Yeah, maybe it doesn't make sense to update just one backup entry at a
> > time because you'd have to update the order of the rest of the backups
> > as well. Perhaps the POST/PUT endpoints are basically the same and
> > always take the full list of backups. But I think we should still keep
> > each entry as an object with an explicit "order" so that we can extend
> > it more easily in the future. I could easily see us adding a "weight"
> > to each entry, so that a particular cachegroup could split the
> > fallback traffic between multiple cachegroups at a time rather than
> > falling back through them sequentially.
> >
> > - Rawlin
> >
> > On Mon, Mar 19, 2018 at 2:30 PM, Chris Lemmons 
> wrote:
> >> So, to continue the conversation, it looks like the list of backup
> >> groups is stored as a List. It's currently loaded by iterating
> >> the elements of the json array in order. That looks great to me.
> >>
> >> It seems odd to me to have a separate order parameter in the API.
> >> Since order has to be unique, it's unlikely that you'd be able to
> >> update the order component to do anything other than "move to an end"
> >> without updating about half the rows in the db anyway. It just feels
> >> like we're asking more of the API user.
> >>
> >> On Mon, Mar 19, 2018 at 2:04 PM, Chris Lemmons 
> wrote:
> >>> Moving a conversation started in Slack to the mailing list:
> >>>
> >>> VijayAnand [8:06 AM]
> >>> Hi
> >>>
> >>> This is regarding TR's Backup Cache Group Selection
> >>> which is https://github.com/apache/incubator-trafficcontrol/
> issues/1907
> >>> GitHub
> >>> Deterministic Cachegroup failover · Issue #1907 ·
> >>> apache/incubator-trafficcontrol
> >>> Currently, if all caches in a cache group are unavailable Traffic
> >>> Router will route clients to the next closest cache group. This works
> >>> great for some networks but could cause problems in others. T...
> >>> Based on Rawlin's review comments on the PR
> >>> https://github.com/apache/incubator-trafficcontrol/pull/1908
> >>> GitHub
> >>> Changes for Backup Edge Cache Group by Vijay-1 · Pull Request #1908 ·
> >>> apache/incubator-trafficcontrol
> >>> This PR implements solution for the issue: #1907 It places the backup
> >>> policy in the CZF file { coverageZones: {
> >>> GROUP2: { backupList: [GROUP1],
> >>> 
> >>> I have the following Schema to support this which is in-sync with
> >>> Rawlin's comments
> >>> CREATE TABLE cachegroup_fallbacks (
> >>>primary_cg bigint,
> >>>backup_cg bigint CHECK (primary_cg != backup_cg),
> >>>set_order bigint DEFAULT 0,
> >>>CONSTRAINT fk_primary_cg FOREIGN KEY (primary_cg) REFERENCES
> >>> cachegroup(id) ON DELETE CASCADE,
> >>>CONSTRAINT fk_backup_cg FOREIGN KEY (backup_cg) REFERENCES
> >>> cachegroup(id) ON DELETE CASCADE,
> >>>UNIQUE (primary_cg, backup_cg),
> >>>UNIQUE (primary_cg, set_order)
> >>> );
> >>>
> >>> ALTER TABLE cachegroup ADD COLUMN fallback_to_closest BOOLEAN DEFAULT
> TRUE;
> >>> Would like to get your views before i start coding for the same
> >>>
> >>> Eric Friedrich [8:15 AM]
> >>> why does the set_order get a default?
> >>>
> >>> VijayAnand [8:17 AM]
> >>> aah. that is not needed
> >>> assuming there is a valid input
> >>> for order
> >>>
> >>> Rawlin Peters [9:15 AM]
> >>> yeah a null value there could be interpreted as zero in the API, but
> >>> maybe we should add a NOT NULL constraint to the columns as well? I
> >>> can't think of any reasons why any column should be optional
> >>>
> >>> Eric Friedrich [9:18 AM]
> >>> any preference in the API between
> >>> ```{"list": ["grp1", "grp2"]}```
> >>> vs
> >>>  ```{"list": [{"name": "grp1", "order": 1},
> >>>  {"name": "grp3", "order: 2},// or 5 or some other
> >>> positive integer
> >>> ]
> >>> }```
> >>>
> >>> Rawlin Peters [9:36 AM]
> >>> 2nd option 

Re: Backup Cache Group Selection

2018-03-13 Thread Vijay Anand
@Rawlin,

Let me try and get the changes done for TR according to your suggestions.
This change would be like:
1. CZF only contains cache groups which should map back to TO's Cache Group
configurations (cr-config)
2. Backup configurations should reach TR via cr-config in the format you
detailed.
3. fallbackToClosest will be True by default. If backupLocation config is
present, it will be assumed as false unless otherwise it is stated as TRUE
explicitly.
4. This will work irrespective of the coverage Zones in CZF as long as the
backup Cache Group specified is in cr-config.

I have a doubt in this as well.

What happens when Geo Limit is set to "CZF Only"  with all backup Caches
are unavailable and fallbackToClosest is set to True. Current
implementation will fail this. Should we do Geo lookup now in this change?

Shall i delete my existing PR and create a new one with these changes?

I will try to get the necessary changes for TO (Perl Mojo) along with this.
Would require your help in TO (Golang) and TP. Will keep you posted.

Thanks,
Vijayanand S




On Tue, Mar 13, 2018 at 3:04 AM, Rawlin Peters 
wrote:

> If we start by putting this in the Cache Group API first, then later
> we really only have to worry about adding the CIDRs to the API. The
> backup config is really just relationships between cache groups, which
> makes perfect sense to model in a relational DB rather than the CZF.
> Why put something in the CZF to just tear it out later?
>
> - Rawlin
>
> On Mon, Mar 12, 2018 at 3:12 PM, Dave Neuman  wrote:
> > Good point Rawlin, but I think it does answer your questions.  CZF for
> now,
> > whatever the new CZF thing is after that.
> >
> > On Mon, Mar 12, 2018 at 1:44 PM, Rawlin Peters 
> > wrote:
> >
> >> The original scope of this thread was determining where the Backup
> >> Cache Group config should live (API vs CZF), not necessarily about
> >> building the entire CZF in the database, although I'm +1 on that idea
> >> as well. I think any decisions made about doing that should probably
> >> be started in a separate thread.
> >>
> >> - Rawlin
> >>
> >> On Mon, Mar 12, 2018 at 1:11 PM, Dave Neuman  wrote:
> >> > +1 on building the CZF in the database.  Jan tried to go down that
> rabbit
> >> > hole but realized it was a pretty hard problem to solve.  I think
> this is
> >> > something we might want to re-visit.  Maybe this is something we
> should
> >> > discuss at our meetup and then update this thread with our decisions?
> >> >
> >> > On Mon, Mar 12, 2018 at 11:25 AM, Rawlin Peters <
> rawlin.pet...@gmail.com
> >> >
> >> > wrote:
> >> >
> >> >> @VijayAnand:
> >> >>
> >> >> Right, a Coverage Zone that doesn't map to a Cache Group in TO won't
> >> >> be chosen as a backup in case of failure, but you could have a
> >> >> Coverage-Zone-not-in-TO that configures Coverage-Zones-in-TO as
> >> >> backups. But, I think the general sentiment right now is that all
> >> >> Coverage Zones in the CZF should map back to Cache Groups in TO, so
> >> >> the backup config should also be done via the Cache Group API.
> >> >>
> >> >> So from the Traffic Router perspective, the process should become:
> >> >> 1. Rather than parsing from the CZF into the NetworkNode class, parse
> >> >> Cache Group backup config from the CRConfig into the existing
> >> >> CacheLocation class
> >> >> 2. in the DS request flow, the NetworkNode will map back to a
> >> >> registered CacheLocation which would contain the backup CG config
> >> >>
> >> >> The rest of the PR's behavior should stay the same, it's just a
> matter
> >> >> of the config being located in a different class. To give you an idea
> >> >> of how I would format it in the CRConfig (so you know how to parse it
> >> >> out), here is a snippet of "edgeLocations" from CRConfig.json:
> >> >>
> >> >> "edgeLocations": {
> >> >> "edge-cg-1": {
> >> >>   "latitude": 1.00,
> >> >>   "longitude": 2.00,
> >> >>   "backupLocations": {
> >> >>   "list": ["edge-cg-2"],
> >> >>   "fallbackToClosest": false
> >> >>   }
> >> >> },
> >> >> "edge-cg-2": {
> >> >>   "latitude": 3.00,
> >> >>   "longitude": 4.00
> >> >> },
> >> >> }
> >> >>
> >> >> The "backupLocations" section would still remain optional (if
> missing,
> >> >> follow existing behavior of falling back to next closest CG).
> Existing
> >> >> defaults in the current PR should remain the same.
> >> >>
> >> >> How would you feel about making those changes in your PR? Feel free
> to
> >> >> tackle the new TO API and Traffic Portal changes too if you want, but
> >> >> I don't want to burden you with this unexpected work if you don't
> want
> >> >> it. I (or another willing contributor) could work on the necessary TO
> >> >> API and Traffic Portal changes sometime in the near future and
> >> >> integrate them with your Traffic Router enhancement.
> >> >>
> >> >> - Rawlin
> >> >>
> >> >>
> >> >>